TOP ACTIVATIONS
MAX = 0.083

butterfly back in its natural habitat . He waved goodbye to
the pit and started to gn aw on a bone .
root , larger and more gn ar led than the others
animals stay in their natural habitat and not take them away
children put on their bath ro bes and ran around in
His mom said , " Why don 't you invite your
away in her pig gy bank every week . This way
mom said . " But why are you so dirty ?"
said . " That 's why it is so important to
mom asked him , " Why are you sad , Tim
said , " That is why we need to brush off
His mom asked , " Why do you refuse to play
see animals in their natural habitat . She ran around the
His mom said , " Why do you want to bury
mom asked her , " Why are you sad , Lily
asked his mom , " Why can 't I whistle ?"
's mom said , " Why don 't you cl ap
my asked her , " Why are you upset , Lily
. Her mom asked her why she was crying and Lily
The mother asked : " Why are you sad ?"

MIN ACTIVATIONS
MIN = -0.083

. They hoped the k ang aroo would forgive them and
shake . Everywhere the k ang aroo went , change happened
little boys ," the k ang aroo says . " I
be careful . The k ang aroo is not a toy
ang aroo !" The k ang aroo sque aked in response
Hello friend !" The k ang aroo sque aked happily in
he could do anything if he set his mind to it
says . " The k ang aroo is not from the
, " What does ' ang ry ' mean ?".
a story ," the k ang aroo says . " Come
ang aroo !". The k ang aroo stopped jumping and hopped
ang aroo . The k ang aroo eats them . It
They hoped the baby k ang aroo would be okay .
ang aroo . The k ang aroo looks at them .
y um ," the k ang aroo says .
It became the happiest books he lf in the world !
ang aroo . The k ang aroo watches them . He
me first ". The k ang aroo promised he would ,
?" The k ang aroo nods . It hops
She said , " H ang up please !" Her mother

INTERVAL 0.041 - 0.083
CONTAINS 0.266%

found a teaspoon in her kitchen drawer and she was so
's okay , I understand why you did it . But
was so glad to understand why everything looked like such a
the grapes out on her kitchen table , so she could
her mom said . " What kind of story do you

INTERVAL 0.000 - 0.041
CONTAINS 48.758%

play with him . Max was sad , but he had
S ara was scared and hurt . She cried
and put it back in the drawer .
home . And that 's how Will y found his ancient
playful pup . He likes to chase things . Are you

INTERVAL -0.041 - 0.000
CONTAINS 50.860%

We restored the old house !" They hugged and laughed together
you want to do with it ?". Alice grinned
and licking his leg . " Rex , are
" Hey ! Come back here !" Sarah yelled .
hospital . The little girl had to stay in the hospital

INTERVAL -0.083 - -0.041
CONTAINS 0.116%

could think of . Everywhere she went , she brought a
, " Where are you all ?" He waited
ruler to measure how big it was . The sw an
help you make a new one ? I know a good
something special , something no one else had . So one

PER-TOKEN DETAILS

Token butterfly
Token ID35113
Feature activation+1.23e-02

Pos logit contributions
apped+0.022
enge+0.022
umbled+0.021
amed+0.021
ay+0.020
Neg logit contributions
 tale-0.013
land-0.013
 tul-0.013
eling-0.013
 gn-0.013
Token back
Token ID736
Feature activation-2.40e-02

Pos logit contributions
eling+0.020
land+0.019
itive+0.019
 tale+0.019
 lav+0.018
Neg logit contributions
apped-0.017
enge-0.016
amed-0.016
ay-0.016
umbled-0.015
Token in
Token ID287
Feature activation+1.54e-02

Pos logit contributions
apped+0.039
enge+0.039
ay+0.037
amed+0.037
umbled+0.036
Neg logit contributions
land-0.033
eling-0.032
itive-0.031
 tale-0.031
 why-0.031
Token its
Token ID663
Feature activation+2.50e-02

Pos logit contributions
eling+0.019
land+0.018
 tale+0.018
itive+0.018
 hesitation+0.018
Neg logit contributions
apped-0.027
enge-0.026
amed-0.026
umbled-0.026
ay-0.025
Token natural
Token ID3288
Feature activation+1.96e-02

Pos logit contributions
 tale+0.032
land+0.032
 tul+0.031
 lav+0.031
 gn+0.030
Neg logit contributions
apped-0.043
enge-0.043
umbled-0.042
ay-0.040
ane-0.039
Token habitat
Token ID20018
Feature activation+8.06e-02

Pos logit contributions
land+0.027
eling+0.027
 tale+0.026
itive+0.026
 tul+0.025
Neg logit contributions
apped-0.032
enge-0.031
amed-0.029
umbled-0.029
ay-0.029
Token.
Token ID13
Feature activation-3.78e-03

Pos logit contributions
eling+0.070
 tale+0.064
itive+0.063
land+0.062
 lav+0.061
Neg logit contributions
apped-0.173
ay-0.166
enge-0.165
amed-0.164
umbled-0.163
Token He
Token ID679
Feature activation+4.56e-03

Pos logit contributions
apped+0.006
enge+0.006
amed+0.006
umbled+0.006
ay+0.005
Neg logit contributions
land-0.005
eling-0.005
 why-0.005
 gn-0.005
 scr-0.005
Token waved
Token ID26834
Feature activation-1.89e-02

Pos logit contributions
eling+0.006
land+0.006
 scr+0.005
 tale+0.005
 gn+0.005
Neg logit contributions
enge-0.008
apped-0.008
ay-0.008
umbled-0.008
amed-0.007
Token goodbye
Token ID24829
Feature activation-1.35e-02

Pos logit contributions
enge+0.033
apped+0.032
umbled+0.032
ase+0.032
amed+0.031
Neg logit contributions
land-0.023
 why-0.022
 hesitation-0.021
itive-0.021
 gn-0.021
Token to
Token ID284
Feature activation+5.49e-03

Pos logit contributions
apped+0.020
enge+0.018
amed+0.018
ay+0.018
umbled+0.018
Neg logit contributions
eling-0.022
itive-0.020
 tale-0.020
land-0.020
 lav-0.020
Token the
Token ID262
Feature activation+8.18e-03

Pos logit contributions
apped+0.010
ay+0.010
enge+0.010
amed+0.009
umbled+0.009
Neg logit contributions
land-0.010
eling-0.010
 tale-0.010
itive-0.010
 lav-0.009
Token pit
Token ID6028
Feature activation+4.17e-02

Pos logit contributions
eling+0.012
 gn+0.012
 why+0.011
 tale+0.011
land+0.011
Neg logit contributions
apped-0.013
enge-0.012
amed-0.012
umbled-0.012
ay-0.012
Token and
Token ID290
Feature activation-1.43e-02

Pos logit contributions
eling+0.049
land+0.048
itive+0.047
 tale+0.047
 gn+0.045
Neg logit contributions
apped-0.075
enge-0.073
amed-0.071
ay-0.071
umbled-0.070
Token started
Token ID2067
Feature activation-1.73e-02

Pos logit contributions
ay+0.025
apped+0.025
enge+0.024
umbled+0.024
amed+0.023
Neg logit contributions
eling-0.019
 gn-0.018
 why-0.018
land-0.017
 scr-0.017
Token to
Token ID284
Feature activation+1.94e-02

Pos logit contributions
apped+0.011
umbled+0.008
amed+0.008
enge+0.007
box+0.007
Neg logit contributions
itive-0.042
 tale-0.041
eling-0.041
land-0.041
mmm-0.040
Token gn
Token ID19967
Feature activation+7.91e-02

Pos logit contributions
land+0.031
 tale+0.030
itive+0.029
leness+0.027
mmm+0.027
Neg logit contributions
 block-0.023
case-0.021
apped-0.021
enge-0.021
 home-0.021
Tokenaw
Token ID707
Feature activation+1.12e-02

Pos logit contributions
eling+0.130
land+0.127
 tale+0.124
itive+0.123
 gn+0.118
Neg logit contributions
apped-0.110
enge-0.104
ay-0.103
amed-0.102
umbled-0.100
Token on
Token ID319
Feature activation+7.20e-03

Pos logit contributions
eling+0.014
 tale+0.013
 tul+0.013
 lav+0.013
itive+0.012
Neg logit contributions
apped-0.021
umbled-0.019
amed-0.019
enge-0.019
ay-0.018
Token a
Token ID257
Feature activation+2.37e-02

Pos logit contributions
 tale+0.013
itive+0.013
eling+0.012
land+0.012
 lull+0.012
Neg logit contributions
apped-0.008
ay-0.008
umbled-0.008
enge-0.007
box-0.007
Token bone
Token ID9970
Feature activation+1.90e-02

Pos logit contributions
eling+0.039
itive+0.038
 tale+0.036
 lav+0.036
land+0.036
Neg logit contributions
ay-0.032
apped-0.029
enge-0.029
amed-0.029
umbled-0.028
Token.
Token ID13
Feature activation-1.14e-03

Pos logit contributions
eling+0.011
land+0.011
itive+0.011
 tale+0.011
 gn+0.010
Neg logit contributions
apped-0.045
enge-0.044
amed-0.044
ay-0.043
umbled-0.043
Token root
Token ID6808
Feature activation+1.88e-02

Pos logit contributions
apped+0.005
ay+0.005
enge+0.005
umbled+0.005
amed+0.005
Neg logit contributions
eling-0.004
itive-0.004
land-0.004
 tale-0.004
 lav-0.003
Token,
Token ID11
Feature activation+2.13e-03

Pos logit contributions
eling+0.024
itive+0.022
land+0.022
 lav+0.021
 tale+0.021
Neg logit contributions
apped-0.033
ay-0.030
amed-0.030
umbled-0.030
enge-0.030
Token larger
Token ID4025
Feature activation-3.01e-03

Pos logit contributions
eling+0.002
land+0.002
itive+0.002
 tale+0.002
 lav+0.002
Neg logit contributions
apped-0.004
enge-0.004
amed-0.004
ay-0.004
umbled-0.004
Token and
Token ID290
Feature activation+3.68e-03

Pos logit contributions
apped+0.005
enge+0.005
ay+0.005
umbled+0.005
amed+0.004
Neg logit contributions
eling-0.004
land-0.004
 tale-0.004
itive-0.004
 lav-0.004
Token more
Token ID517
Feature activation-5.93e-03

Pos logit contributions
eling+0.004
land+0.004
itive+0.004
 tale+0.004
 why+0.004
Neg logit contributions
apped-0.007
enge-0.007
ay-0.007
umbled-0.006
amed-0.006
Token gn
Token ID19967
Feature activation+7.62e-02

Pos logit contributions
apped+0.010
ay+0.009
umbled+0.009
enge+0.009
amed+0.009
Neg logit contributions
eling-0.008
land-0.007
 tale-0.007
itive-0.007
 lav-0.007
Tokenar
Token ID283
Feature activation+1.82e-02

Pos logit contributions
eling+0.119
land+0.115
 tale+0.114
itive+0.113
 lav+0.112
Neg logit contributions
apped-0.112
enge-0.107
amed-0.103
umbled-0.103
ay-0.102
Tokenled
Token ID992
Feature activation-7.14e-03

Pos logit contributions
eling+0.029
land+0.028
 tale+0.028
itive+0.027
 lav+0.027
Neg logit contributions
apped-0.026
enge-0.024
amed-0.024
umbled-0.024
ay-0.024
Token than
Token ID621
Feature activation+5.71e-03

Pos logit contributions
apped+0.010
enge+0.010
ay+0.010
amed+0.010
umbled+0.009
Neg logit contributions
eling-0.011
land-0.011
itive-0.011
 tale-0.011
 why-0.010
Token the
Token ID262
Feature activation-6.79e-03

Pos logit contributions
eling+0.006
land+0.006
itive+0.006
 why+0.006
 tale+0.006
Neg logit contributions
apped-0.011
enge-0.011
amed-0.011
ay-0.011
umbled-0.010
Token others
Token ID1854
Feature activation+1.83e-02

Pos logit contributions
apped+0.011
enge+0.010
ay+0.010
amed+0.010
umbled+0.010
Neg logit contributions
eling-0.010
land-0.009
 tale-0.009
itive-0.009
 lav-0.009
Token animals
Token ID4695
Feature activation+1.70e-03

Pos logit contributions
apped+0.003
enge+0.003
amed+0.003
umbled+0.003
ay+0.003
Neg logit contributions
eling-0.003
land-0.003
itive-0.003
 tale-0.003
 gn-0.003
Token stay
Token ID2652
Feature activation+2.47e-02

Pos logit contributions
eling+0.002
itive+0.002
land+0.002
 tale+0.002
 lav+0.002
Neg logit contributions
apped-0.003
enge-0.003
ay-0.003
amed-0.003
umbled-0.003
Token in
Token ID287
Feature activation+1.74e-02

Pos logit contributions
eling+0.040
land+0.039
 tul+0.039
itive+0.038
 gn+0.038
Neg logit contributions
apped-0.035
amed-0.034
ay-0.031
enge-0.031
umbled-0.029
Token their
Token ID511
Feature activation+2.66e-02

Pos logit contributions
eling+0.026
land+0.025
itive+0.025
 tale+0.023
 scr+0.023
Neg logit contributions
apped-0.025
ay-0.025
enge-0.024
amed-0.024
umbled-0.023
Token natural
Token ID3288
Feature activation+9.80e-03

Pos logit contributions
eling+0.039
 tale+0.039
land+0.038
 lav+0.036
 why+0.036
Neg logit contributions
apped-0.043
enge-0.041
amed-0.040
umbled-0.040
ay-0.039
Token habitat
Token ID20018
Feature activation+7.50e-02

Pos logit contributions
eling+0.012
land+0.012
itive+0.012
 tale+0.012
 lav+0.011
Neg logit contributions
apped-0.017
enge-0.017
ay-0.016
amed-0.016
umbled-0.016
Token and
Token ID290
Feature activation+1.00e-02

Pos logit contributions
land+0.068
eling+0.067
 tale+0.064
itive+0.063
 why+0.059
Neg logit contributions
apped-0.158
enge-0.153
amed-0.149
umbled-0.148
ay-0.148
Token not
Token ID407
Feature activation+7.27e-03

Pos logit contributions
eling+0.013
land+0.012
itive+0.012
 tale+0.012
 lav+0.011
Neg logit contributions
apped-0.018
enge-0.017
amed-0.016
ay-0.016
umbled-0.016
Token take
Token ID1011
Feature activation+2.76e-02

Pos logit contributions
eling+0.011
itive+0.010
land+0.010
 tale+0.010
 lav+0.010
Neg logit contributions
apped-0.011
enge-0.010
umbled-0.010
ay-0.010
amed-0.010
Token them
Token ID606
Feature activation+1.95e-02

Pos logit contributions
eling+0.033
itive+0.033
land+0.032
 gn+0.031
 tale+0.031
Neg logit contributions
apped-0.049
enge-0.047
ay-0.047
amed-0.046
umbled-0.045
Token away
Token ID1497
Feature activation-3.28e-03

Pos logit contributions
land+0.032
eling+0.032
 tale+0.031
itive+0.030
 why+0.030
Neg logit contributions
apped-0.027
enge-0.026
umbled-0.025
amed-0.025
ay-0.025
Token children
Token ID1751
Feature activation+9.24e-03

Pos logit contributions
enge+0.007
apped+0.007
umbled+0.007
ase+0.007
ay+0.007
Neg logit contributions
eling-0.004
 why-0.004
 gn-0.004
 scr-0.004
land-0.004
Token put
Token ID1234
Feature activation+9.47e-03

Pos logit contributions
eling+0.010
land+0.010
itive+0.010
 tale+0.010
 why+0.010
Neg logit contributions
enge-0.017
apped-0.017
ay-0.017
amed-0.016
umbled-0.016
Token on
Token ID319
Feature activation-1.21e-02

Pos logit contributions
eling+0.011
land+0.011
 lav+0.011
 why+0.011
 tale+0.011
Neg logit contributions
apped-0.016
umbled-0.016
enge-0.016
amed-0.015
ay-0.015
Token their
Token ID511
Feature activation+1.58e-02

Pos logit contributions
apped+0.023
enge+0.022
umbled+0.022
amed+0.021
ay+0.021
Neg logit contributions
eling-0.014
land-0.014
 tale-0.013
 lav-0.013
 tul-0.013
Token bath
Token ID7837
Feature activation+3.97e-02

Pos logit contributions
land+0.020
eling+0.020
 tale+0.020
 lav+0.019
 scr+0.019
Neg logit contributions
apped-0.028
enge-0.027
umbled-0.027
amed-0.026
ay-0.026
Tokenro
Token ID305
Feature activation+7.47e-02

Pos logit contributions
land+0.067
eling+0.066
 tale+0.064
itive+0.064
 lav+0.062
Neg logit contributions
apped-0.052
enge-0.050
amed-0.048
umbled-0.048
ay-0.048
Tokenbes
Token ID12636
Feature activation+5.78e-02

Pos logit contributions
eling+0.098
 why+0.089
 tale+0.089
itive+0.089
 lav+0.088
Neg logit contributions
apped-0.131
amed-0.127
umbled-0.119
enge-0.117
ay-0.117
Token and
Token ID290
Feature activation+9.27e-03

Pos logit contributions
land+0.056
eling+0.053
 tale+0.052
itive+0.051
 why+0.048
Neg logit contributions
apped-0.117
enge-0.115
umbled-0.111
amed-0.111
ay-0.111
Token ran
Token ID4966
Feature activation+1.37e-02

Pos logit contributions
eling+0.011
land+0.011
 why+0.011
 lav+0.011
 gn+0.011
Neg logit contributions
apped-0.017
enge-0.016
ay-0.016
umbled-0.016
amed-0.016
Token around
Token ID1088
Feature activation-4.08e-03

Pos logit contributions
land+0.024
 why+0.021
 tale+0.021
eling+0.021
 hesitation+0.021
Neg logit contributions
apped-0.018
enge-0.018
umbled-0.017
ay-0.017
amed-0.016
Token in
Token ID287
Feature activation+2.05e-03

Pos logit contributions
apped+0.007
enge+0.007
amed+0.007
ay+0.007
umbled+0.007
Neg logit contributions
eling-0.005
land-0.005
itive-0.005
 tale-0.005
 lav-0.005
TokenHis
Token ID6653
Feature activation+1.52e-02

Pos logit contributions
eling+0.003
land+0.003
itive+0.003
 tale+0.003
 lav+0.003
Neg logit contributions
apped-0.005
enge-0.005
amed-0.005
ay-0.005
umbled-0.005
Token mom
Token ID1995
Feature activation-6.17e-03

Pos logit contributions
 why+0.020
 tale+0.020
 gn+0.019
 hesitation+0.019
 scr+0.019
Neg logit contributions
umbled-0.025
apped-0.025
enge-0.025
amed-0.025
ase-0.023
Token said
Token ID531
Feature activation+1.03e-02

Pos logit contributions
apped+0.009
enge+0.009
amed+0.009
ay+0.009
umbled+0.009
Neg logit contributions
eling-0.009
land-0.009
itive-0.009
 tale-0.009
 lav-0.008
Token,
Token ID11
Feature activation+1.52e-02

Pos logit contributions
eling+0.013
 tale+0.011
land+0.011
itive+0.011
 tul+0.010
Neg logit contributions
apped-0.020
amed-0.018
ay-0.018
enge-0.018
umbled-0.017
Token "
Token ID366
Feature activation+3.33e-02

Pos logit contributions
eling+0.028
 tale+0.025
land+0.025
itive+0.025
leness+0.023
Neg logit contributions
apped-0.021
amed-0.018
ay-0.018
box-0.017
enge-0.017
TokenWhy
Token ID5195
Feature activation+7.43e-02

Pos logit contributions
eling+0.042
land+0.041
itive+0.040
 tale+0.040
 lav+0.038
Neg logit contributions
apped-0.058
enge-0.055
ay-0.055
amed-0.054
umbled-0.054
Token don
Token ID836
Feature activation-3.28e-04

Pos logit contributions
eling+0.097
land+0.094
 tale+0.093
itive+0.091
 lav+0.090
Neg logit contributions
apped-0.126
enge-0.120
ay-0.118
umbled-0.115
amed-0.114
Token't
Token ID470
Feature activation-2.01e-02

Pos logit contributions
apped+0.000
enge+0.000
ay+0.000
amed+0.000
umbled+0.000
Neg logit contributions
eling-0.001
land-0.001
itive-0.001
 tale-0.001
 lav-0.001
Token you
Token ID345
Feature activation-4.22e-03

Pos logit contributions
apped+0.025
ay+0.024
enge+0.023
umbled+0.023
amed+0.021
Neg logit contributions
land-0.035
eling-0.034
 tale-0.034
 lav-0.033
itive-0.033
Token invite
Token ID14037
Feature activation-1.55e-02

Pos logit contributions
apped+0.008
enge+0.008
amed+0.008
umbled+0.008
ay+0.008
Neg logit contributions
eling-0.005
land-0.004
 tale-0.004
itive-0.004
 why-0.004
Token your
Token ID534
Feature activation-1.70e-02

Pos logit contributions
apped+0.027
umbled+0.026
amed+0.025
enge+0.025
ay+0.024
Neg logit contributions
 why-0.021
eling-0.020
land-0.020
 tale-0.019
 scr-0.018
Token away
Token ID1497
Feature activation-1.47e-02

Pos logit contributions
apped+0.006
enge+0.005
umbled+0.005
ay+0.005
amed+0.005
Neg logit contributions
land-0.003
eling-0.003
 why-0.003
itive-0.003
 tale-0.003
Token in
Token ID287
Feature activation+1.65e-02

Pos logit contributions
apped+0.030
enge+0.029
ay+0.029
amed+0.028
umbled+0.028
Neg logit contributions
eling-0.014
 tale-0.013
itive-0.013
land-0.013
 lav-0.012
Token her
Token ID607
Feature activation+2.67e-02

Pos logit contributions
eling+0.018
land+0.018
 tale+0.017
 why+0.017
 hesitation+0.017
Neg logit contributions
apped-0.032
enge-0.031
amed-0.030
umbled-0.030
ay-0.029
Token pig
Token ID12967
Feature activation+2.36e-02

Pos logit contributions
 tale+0.039
land+0.038
eling+0.038
 gn+0.036
 scr+0.036
Neg logit contributions
apped-0.042
umbled-0.040
enge-0.040
amed-0.040
ay-0.038
Tokengy
Token ID1360
Feature activation+6.02e-02

Pos logit contributions
eling+0.026
land+0.026
 tale+0.025
itive+0.025
 gn+0.023
Neg logit contributions
apped-0.045
enge-0.043
amed-0.042
ay-0.042
umbled-0.042
Tokenbank
Token ID17796
Feature activation+7.37e-02

Pos logit contributions
eling+0.074
land+0.073
 tale+0.072
itive+0.070
 gn+0.067
Neg logit contributions
apped-0.107
enge-0.103
amed-0.101
ay-0.100
umbled-0.100
Token every
Token ID790
Feature activation-3.32e-03

Pos logit contributions
eling+0.064
land+0.058
 tale+0.058
itive+0.057
 lav+0.055
Neg logit contributions
apped-0.160
enge-0.152
amed-0.151
ay-0.150
umbled-0.150
Token week
Token ID1285
Feature activation+4.69e-03

Pos logit contributions
apped+0.006
enge+0.006
amed+0.005
umbled+0.005
ay+0.005
Neg logit contributions
eling-0.004
 tale-0.004
land-0.004
itive-0.004
 why-0.004
Token.
Token ID13
Feature activation-1.20e-02

Pos logit contributions
eling+0.004
land+0.004
itive+0.004
 tale+0.004
 lav+0.003
Neg logit contributions
apped-0.010
enge-0.010
amed-0.010
ay-0.010
umbled-0.010
Token This
Token ID770
Feature activation+1.84e-02

Pos logit contributions
apped+0.021
ay+0.020
enge+0.019
umbled+0.019
amed+0.019
Neg logit contributions
eling-0.016
itive-0.016
land-0.015
 tale-0.015
 hesitation-0.014
Token way
Token ID835
Feature activation-1.52e-02

Pos logit contributions
eling+0.018
land+0.017
 tale+0.017
itive+0.017
 lav+0.016
Neg logit contributions
apped-0.037
enge-0.036
amed-0.035
ay-0.035
umbled-0.035
Token mom
Token ID1995
Feature activation-2.04e-02

Pos logit contributions
eling+0.006
land+0.006
 tale+0.006
itive+0.006
 lav+0.006
Neg logit contributions
apped-0.008
enge-0.007
amed-0.007
umbled-0.007
ay-0.007
Token said
Token ID531
Feature activation-6.40e-03

Pos logit contributions
apped+0.031
enge+0.029
ay+0.029
amed+0.029
umbled+0.028
Neg logit contributions
eling-0.031
land-0.030
itive-0.029
 tale-0.029
 why-0.028
Token.
Token ID13
Feature activation+5.71e-03

Pos logit contributions
apped+0.013
amed+0.011
umbled+0.011
enge+0.011
ay+0.011
Neg logit contributions
eling-0.008
 tale-0.007
land-0.007
 tul-0.007
itive-0.007
Token "
Token ID366
Feature activation+1.98e-02

Pos logit contributions
eling+0.009
itive+0.008
 tale+0.008
land+0.008
 hesitation+0.007
Neg logit contributions
apped-0.009
ay-0.009
enge-0.009
amed-0.009
umbled-0.008
TokenBut
Token ID1537
Feature activation+7.38e-03

Pos logit contributions
eling+0.023
land+0.022
itive+0.022
 tale+0.022
 lav+0.021
Neg logit contributions
apped-0.037
enge-0.035
ay-0.035
amed-0.034
umbled-0.034
Token why
Token ID1521
Feature activation+7.36e-02

Pos logit contributions
land+0.010
eling+0.010
 tale+0.010
itive+0.010
 gn+0.009
Neg logit contributions
apped-0.012
enge-0.011
ay-0.011
amed-0.010
umbled-0.010
Token are
Token ID389
Feature activation-1.32e-03

Pos logit contributions
eling+0.092
land+0.090
 tale+0.088
itive+0.087
 gn+0.085
Neg logit contributions
apped-0.128
enge-0.121
ay-0.120
umbled-0.116
amed-0.114
Token you
Token ID345
Feature activation+8.43e-03

Pos logit contributions
apped+0.002
amed+0.002
enge+0.002
ay+0.002
umbled+0.002
Neg logit contributions
eling-0.002
 why-0.002
land-0.002
 tale-0.002
 lav-0.002
Token so
Token ID523
Feature activation+5.27e-04

Pos logit contributions
eling+0.010
land+0.010
 tale+0.010
itive+0.010
 lav+0.009
Neg logit contributions
apped-0.015
enge-0.014
amed-0.014
ay-0.014
umbled-0.014
Token dirty
Token ID11841
Feature activation-1.61e-02

Pos logit contributions
eling+0.001
itive+0.001
land+0.001
 tale+0.001
 scr+0.001
Neg logit contributions
apped-0.001
enge-0.001
amed-0.001
ay-0.001
umbled-0.001
Token?"
Token ID1701
Feature activation+1.55e-02

Pos logit contributions
apped+0.029
enge+0.028
amed+0.028
umbled+0.028
ay+0.028
Neg logit contributions
eling-0.019
land-0.019
itive-0.018
 tale-0.018
 why-0.018
Token said
Token ID531
Feature activation-3.04e-03

Pos logit contributions
eling+0.003
land+0.003
 tale+0.003
itive+0.003
 lav+0.003
Neg logit contributions
apped-0.003
enge-0.003
amed-0.003
ay-0.003
umbled-0.003
Token.
Token ID13
Feature activation+1.15e-02

Pos logit contributions
apped+0.006
umbled+0.005
amed+0.005
enge+0.005
ay+0.005
Neg logit contributions
eling-0.004
 tale-0.004
land-0.004
 tul-0.004
 lav-0.003
Token "
Token ID366
Feature activation+1.87e-02

Pos logit contributions
eling+0.019
 tale+0.018
land+0.017
itive+0.017
 hesitation+0.016
Neg logit contributions
apped-0.018
umbled-0.016
enge-0.016
ay-0.016
amed-0.016
TokenThat
Token ID2504
Feature activation+4.55e-03

Pos logit contributions
eling+0.025
land+0.024
itive+0.024
 tale+0.024
 lav+0.023
Neg logit contributions
apped-0.031
enge-0.030
ay-0.029
amed-0.029
umbled-0.029
Token's
Token ID338
Feature activation+5.49e-03

Pos logit contributions
land+0.005
 why+0.005
itive+0.005
 tale+0.005
eling+0.005
Neg logit contributions
apped-0.009
amed-0.009
enge-0.009
umbled-0.008
ay-0.008
Token why
Token ID1521
Feature activation+7.35e-02

Pos logit contributions
land+0.009
itive+0.008
 tale+0.008
eling+0.008
yt+0.008
Neg logit contributions
apped-0.008
enge-0.008
amed-0.008
ay-0.007
box-0.007
Token it
Token ID340
Feature activation-2.65e-03

Pos logit contributions
eling+0.102
land+0.100
 tale+0.098
itive+0.098
 lav+0.094
Neg logit contributions
apped-0.118
enge-0.113
ay-0.111
amed-0.110
umbled-0.109
Token is
Token ID318
Feature activation+7.60e-03

Pos logit contributions
apped+0.004
enge+0.004
amed+0.004
ay+0.004
umbled+0.004
Neg logit contributions
eling-0.004
land-0.003
itive-0.003
 tale-0.003
 why-0.003
Token so
Token ID523
Feature activation+1.07e-02

Pos logit contributions
eling+0.009
land+0.009
itive+0.009
 tale+0.009
 lav+0.009
Neg logit contributions
apped-0.014
enge-0.013
ay-0.013
amed-0.013
umbled-0.013
Token important
Token ID1593
Feature activation+6.47e-03

Pos logit contributions
eling+0.014
land+0.013
itive+0.013
 tale+0.013
 lav+0.013
Neg logit contributions
apped-0.019
enge-0.018
amed-0.018
ay-0.017
umbled-0.017
Token to
Token ID284
Feature activation+9.52e-03

Pos logit contributions
eling+0.010
land+0.009
 tale+0.009
itive+0.009
 lav+0.009
Neg logit contributions
apped-0.010
enge-0.009
ay-0.009
umbled-0.009
amed-0.009
Token mom
Token ID1995
Feature activation-1.65e-03

Pos logit contributions
 gn+0.001
 why+0.001
itive+0.001
 lav+0.001
 tul+0.001
Neg logit contributions
apped-0.001
amed-0.001
umbled-0.001
enge-0.001
ay-0.001
Token asked
Token ID1965
Feature activation+2.38e-02

Pos logit contributions
apped+0.002
enge+0.002
amed+0.002
ay+0.002
umbled+0.002
Neg logit contributions
eling-0.003
land-0.003
itive-0.002
 tale-0.002
 lav-0.002
Token him
Token ID683
Feature activation+1.11e-02

Pos logit contributions
 tale+0.049
land+0.049
eling+0.049
itive+0.047
 lav+0.046
Neg logit contributions
apped-0.022
ay-0.020
enge-0.020
umbled-0.018
amed-0.017
Token,
Token ID11
Feature activation+1.59e-02

Pos logit contributions
eling+0.017
land+0.016
itive+0.016
 tale+0.015
 scr+0.015
Neg logit contributions
apped-0.017
enge-0.016
ay-0.016
amed-0.016
umbled-0.015
Token "
Token ID366
Feature activation+3.22e-02

Pos logit contributions
eling+0.032
land+0.031
 tale+0.031
leness+0.030
itive+0.030
Neg logit contributions
apped-0.017
 of-0.013
ay-0.013
box-0.013
amed-0.013
TokenWhy
Token ID5195
Feature activation+7.34e-02

Pos logit contributions
eling+0.029
land+0.028
 tale+0.027
itive+0.027
 lav+0.025
Neg logit contributions
apped-0.068
enge-0.066
amed-0.065
ay-0.065
umbled-0.064
Token are
Token ID389
Feature activation-6.80e-03

Pos logit contributions
 why+0.089
itive+0.088
 seem+0.088
eling+0.086
 what+0.082
Neg logit contributions
amed-0.136
apped-0.131
enge-0.127
umbled-0.125
amer-0.121
Token you
Token ID345
Feature activation+1.06e-02

Pos logit contributions
apped+0.011
enge+0.011
amed+0.010
ay+0.010
umbled+0.010
Neg logit contributions
eling-0.010
itive-0.009
land-0.009
 tale-0.009
 why-0.009
Token sad
Token ID6507
Feature activation+2.39e-02

Pos logit contributions
eling+0.013
land+0.013
itive+0.013
 tale+0.012
 lav+0.012
Neg logit contributions
apped-0.019
enge-0.018
ay-0.018
amed-0.018
umbled-0.017
Token,
Token ID11
Feature activation+7.57e-03

Pos logit contributions
eling+0.023
land+0.022
itive+0.022
 tale+0.022
 lav+0.020
Neg logit contributions
apped-0.049
enge-0.047
amed-0.047
ay-0.047
umbled-0.046
Token Tim
Token ID5045
Feature activation+7.07e-03

Pos logit contributions
 why+0.012
eling+0.012
 scr+0.011
itive+0.011
land+0.011
Neg logit contributions
apped-0.011
amed-0.011
enge-0.011
umbled-0.010
ase-0.010
Token said
Token ID531
Feature activation+9.68e-03

Pos logit contributions
eling+0.021
land+0.021
itive+0.021
 tale+0.021
 lav+0.020
Neg logit contributions
apped-0.021
enge-0.019
amed-0.019
ay-0.019
umbled-0.018
Token,
Token ID11
Feature activation+1.49e-02

Pos logit contributions
eling+0.014
 tale+0.012
land+0.012
itive+0.011
uf+0.011
Neg logit contributions
apped-0.018
ay-0.016
enge-0.016
amed-0.015
umbled-0.015
Token "
Token ID366
Feature activation+2.97e-02

Pos logit contributions
eling+0.033
 tale+0.030
land+0.029
itive+0.029
leness+0.028
Neg logit contributions
apped-0.017
amed-0.013
ay-0.013
umbled-0.012
enge-0.012
TokenThat
Token ID2504
Feature activation+1.30e-02

Pos logit contributions
eling+0.039
land+0.039
itive+0.038
 tale+0.038
 lav+0.037
Neg logit contributions
apped-0.049
enge-0.046
amed-0.045
umbled-0.044
ay-0.044
Token is
Token ID318
Feature activation+1.29e-02

Pos logit contributions
eling+0.017
land+0.017
 tale+0.017
itive+0.017
 lav+0.016
Neg logit contributions
apped-0.022
enge-0.021
amed-0.020
ay-0.020
umbled-0.020
Token why
Token ID1521
Feature activation+7.34e-02

Pos logit contributions
itive+0.019
land+0.019
eling+0.018
 tale+0.018
 hesitation+0.017
Neg logit contributions
enge-0.019
apped-0.019
ay-0.019
amed-0.019
box-0.017
Token we
Token ID356
Feature activation-1.44e-03

Pos logit contributions
eling+0.105
land+0.104
 tale+0.102
itive+0.101
 gn+0.097
Neg logit contributions
apped-0.114
enge-0.108
ay-0.107
umbled-0.104
amed-0.103
Token need
Token ID761
Feature activation-1.83e-02

Pos logit contributions
apped+0.003
enge+0.003
umbled+0.002
ay+0.002
amed+0.002
Neg logit contributions
eling-0.002
 why-0.002
 seem-0.002
 scr-0.002
itive-0.002
Token to
Token ID284
Feature activation+1.95e-02

Pos logit contributions
apped+0.029
enge+0.028
amed+0.027
ay+0.027
umbled+0.027
Neg logit contributions
eling-0.026
land-0.025
 tale-0.024
itive-0.024
 lav-0.024
Token brush
Token ID14093
Feature activation-5.16e-04

Pos logit contributions
itive+0.031
 tale+0.031
land+0.030
eling+0.030
 lav+0.028
Neg logit contributions
ay-0.026
apped-0.025
enge-0.023
amed-0.023
umbled-0.022
Token off
Token ID572
Feature activation-9.25e-03

Pos logit contributions
apped+0.001
ay+0.001
enge+0.001
amed+0.001
umbled+0.001
Neg logit contributions
eling-0.001
itive-0.001
 tale-0.001
land-0.001
 gn-0.001
TokenHis
Token ID6653
Feature activation+2.06e-02

Pos logit contributions
eling+0.013
land+0.012
 tale+0.012
itive+0.012
 lav+0.012
Neg logit contributions
apped-0.013
enge-0.013
ay-0.013
amed-0.012
umbled-0.012
Token mom
Token ID1995
Feature activation-7.32e-03

Pos logit contributions
 gn+0.028
 scr+0.028
 why+0.028
 tale+0.028
 hesitation+0.027
Neg logit contributions
apped-0.033
enge-0.032
amed-0.031
umbled-0.031
ase-0.029
Token asked
Token ID1965
Feature activation+1.65e-02

Pos logit contributions
apped+0.013
enge+0.013
ay+0.012
amed+0.012
umbled+0.012
Neg logit contributions
eling-0.009
 why-0.009
itive-0.008
 tale-0.008
land-0.008
Token,
Token ID11
Feature activation+1.50e-02

Pos logit contributions
 tale+0.032
eling+0.031
land+0.031
 lav+0.030
 carniv+0.030
Neg logit contributions
apped-0.017
ay-0.016
enge-0.016
box-0.014
umbled-0.014
Token "
Token ID366
Feature activation+3.12e-02

Pos logit contributions
 tale+0.033
eling+0.033
land+0.033
leness+0.031
itive+0.030
Neg logit contributions
apped-0.015
box-0.010
amed-0.010
ay-0.010
enge-0.010
TokenWhy
Token ID5195
Feature activation+7.33e-02

Pos logit contributions
land+0.026
 tale+0.026
eling+0.025
itive+0.025
 hesitation+0.023
Neg logit contributions
apped-0.066
enge-0.066
amed-0.064
umbled-0.063
ay-0.063
Token do
Token ID466
Feature activation+9.88e-03

Pos logit contributions
eling+0.090
itive+0.088
 why+0.086
land+0.086
 tale+0.083
Neg logit contributions
apped-0.131
amed-0.129
enge-0.126
umbled-0.123
ay-0.122
Token you
Token ID345
Feature activation+3.00e-04

Pos logit contributions
land+0.017
eling+0.017
itive+0.017
 tale+0.016
 lav+0.016
Neg logit contributions
apped-0.012
enge-0.012
ay-0.012
umbled-0.011
amed-0.010
Token refuse
Token ID11148
Feature activation+3.93e-02

Pos logit contributions
land+0.000
eling+0.000
 tale+0.000
itive+0.000
 lav+0.000
Neg logit contributions
apped-0.001
enge-0.000
ay-0.000
amed-0.000
umbled-0.000
Token to
Token ID284
Feature activation+1.68e-02

Pos logit contributions
eling+0.060
land+0.058
 why+0.057
 tale+0.057
itive+0.056
Neg logit contributions
apped-0.060
enge-0.057
amed-0.056
ay-0.055
umbled-0.055
Token play
Token ID711
Feature activation-1.17e-02

Pos logit contributions
itive+0.025
eling+0.025
land+0.025
 tale+0.024
 lav+0.023
Neg logit contributions
apped-0.025
ay-0.024
enge-0.024
amed-0.023
umbled-0.022
Token see
Token ID766
Feature activation-9.10e-03

Pos logit contributions
eling+0.005
land+0.005
 tale+0.005
itive+0.005
 why+0.005
Neg logit contributions
apped-0.008
enge-0.007
ay-0.007
umbled-0.007
amed-0.007
Token animals
Token ID4695
Feature activation+5.65e-03

Pos logit contributions
apped+0.014
enge+0.013
ay+0.013
amed+0.013
umbled+0.013
Neg logit contributions
eling-0.013
itive-0.013
land-0.013
 tale-0.013
 lav-0.012
Token in
Token ID287
Feature activation+4.86e-03

Pos logit contributions
eling+0.007
itive+0.007
 lav+0.007
land+0.006
 tul+0.006
Neg logit contributions
apped-0.010
ay-0.009
enge-0.009
amed-0.009
umbled-0.009
Token their
Token ID511
Feature activation+2.46e-02

Pos logit contributions
eling+0.007
itive+0.007
land+0.007
 tale+0.006
 lav+0.006
Neg logit contributions
apped-0.008
ay-0.007
enge-0.007
amed-0.007
umbled-0.007
Token natural
Token ID3288
Feature activation+7.06e-03

Pos logit contributions
eling+0.024
land+0.024
 tale+0.024
 gn+0.022
itive+0.022
Neg logit contributions
apped-0.050
enge-0.048
umbled-0.047
amed-0.047
ay-0.046
Token habitat
Token ID20018
Feature activation+7.33e-02

Pos logit contributions
eling+0.008
land+0.008
 tale+0.008
itive+0.008
 gn+0.008
Neg logit contributions
apped-0.013
enge-0.012
amed-0.012
ay-0.012
umbled-0.012
Token.
Token ID13
Feature activation-7.00e-03

Pos logit contributions
eling+0.057
land+0.052
itive+0.051
 tale+0.051
 lav+0.048
Neg logit contributions
apped-0.165
enge-0.159
ay-0.157
amed-0.157
umbled-0.156
Token She
Token ID1375
Feature activation+4.01e-03

Pos logit contributions
apped+0.011
ay+0.011
enge+0.010
umbled+0.010
amed+0.010
Neg logit contributions
eling-0.011
itive-0.011
 tale-0.010
 hesitation-0.010
 lav-0.010
Token ran
Token ID4966
Feature activation-3.31e-03

Pos logit contributions
eling+0.005
 gn+0.005
 why+0.005
land+0.005
 scr+0.004
Neg logit contributions
ay-0.008
apped-0.007
umbled-0.007
enge-0.007
ase-0.007
Token around
Token ID1088
Feature activation-4.32e-03

Pos logit contributions
apped+0.006
umbled+0.005
enge+0.005
ase+0.005
ay+0.005
Neg logit contributions
land-0.004
 gn-0.004
 why-0.004
 tale-0.004
 while-0.004
Token the
Token ID262
Feature activation+4.74e-04

Pos logit contributions
apped+0.008
enge+0.007
ay+0.007
umbled+0.007
amed+0.007
Neg logit contributions
eling-0.005
land-0.005
 tale-0.005
itive-0.005
 gn-0.005
TokenHis
Token ID6653
Feature activation+2.06e-02

Pos logit contributions
land+0.011
 scr+0.010
 gn+0.010
 why+0.010
itive+0.010
Neg logit contributions
apped-0.027
enge-0.026
umbled-0.025
amed-0.025
ane-0.024
Token mom
Token ID1995
Feature activation-5.03e-03

Pos logit contributions
 scr+0.026
 gn+0.026
 tale+0.026
 why+0.026
itive+0.025
Neg logit contributions
amed-0.034
apped-0.034
enge-0.034
umbled-0.034
ase-0.032
Token said
Token ID531
Feature activation+8.95e-03

Pos logit contributions
apped+0.009
enge+0.009
ay+0.008
umbled+0.008
amed+0.008
Neg logit contributions
eling-0.006
 why-0.006
itive-0.006
 tale-0.006
land-0.006
Token,
Token ID11
Feature activation+1.88e-02

Pos logit contributions
eling+0.012
land+0.010
 tale+0.010
itive+0.010
 tul+0.009
Neg logit contributions
apped-0.017
enge-0.015
ay-0.015
amed-0.015
umbled-0.015
Token "
Token ID366
Feature activation+3.57e-02

Pos logit contributions
eling+0.040
 tale+0.038
land+0.037
leness+0.034
itive+0.034
Neg logit contributions
apped-0.021
ay-0.016
box-0.016
amed-0.016
enge-0.015
TokenWhy
Token ID5195
Feature activation+7.33e-02

Pos logit contributions
eling+0.047
land+0.045
 tale+0.044
itive+0.044
 lav+0.043
Neg logit contributions
apped-0.062
enge-0.057
amed-0.057
ay-0.057
umbled-0.056
Token do
Token ID466
Feature activation+1.26e-02

Pos logit contributions
eling+0.079
land+0.075
 tale+0.074
itive+0.073
 lav+0.070
Neg logit contributions
apped-0.143
enge-0.137
ay-0.135
umbled-0.133
amed-0.133
Token you
Token ID345
Feature activation+4.11e-03

Pos logit contributions
land+0.023
eling+0.022
itive+0.022
 tale+0.021
 tul+0.021
Neg logit contributions
apped-0.014
enge-0.014
ay-0.014
umbled-0.013
 losers-0.012
Token want
Token ID765
Feature activation+1.82e-03

Pos logit contributions
eling+0.005
itive+0.005
 tale+0.005
land+0.005
 why+0.005
Neg logit contributions
apped-0.007
enge-0.007
amed-0.007
umbled-0.007
ay-0.007
Token to
Token ID284
Feature activation+1.76e-02

Pos logit contributions
 why+0.003
 lav+0.003
 tul+0.002
eling+0.002
 gn+0.002
Neg logit contributions
apped-0.003
amed-0.003
enge-0.003
umbled-0.003
ay-0.003
Token bury
Token ID30033
Feature activation+2.51e-02

Pos logit contributions
eling+0.023
land+0.022
itive+0.022
 tale+0.021
 lav+0.020
Neg logit contributions
apped-0.030
enge-0.029
ay-0.029
amed-0.028
umbled-0.028
Token mom
Token ID1995
Feature activation-6.60e-04

Pos logit contributions
 tale+0.035
 gn+0.034
 lav+0.034
itive+0.034
land+0.033
Neg logit contributions
apped-0.034
umbled-0.032
enge-0.032
amed-0.031
ay-0.030
Token asked
Token ID1965
Feature activation+2.21e-02

Pos logit contributions
apped+0.001
enge+0.001
ay+0.001
umbled+0.001
amed+0.001
Neg logit contributions
eling-0.001
 tale-0.001
 why-0.001
itive-0.001
land-0.001
Token her
Token ID607
Feature activation+8.49e-03

Pos logit contributions
eling+0.037
land+0.036
itive+0.036
 tale+0.035
 gn+0.034
Neg logit contributions
apped-0.029
ay-0.028
enge-0.026
ied-0.025
amed-0.024
Token,
Token ID11
Feature activation+1.45e-02

Pos logit contributions
eling+0.015
land+0.013
itive+0.013
 scr+0.013
 lookout+0.013
Neg logit contributions
apped-0.012
ay-0.010
amed-0.010
enge-0.010
ied-0.010
Token "
Token ID366
Feature activation+3.37e-02

Pos logit contributions
eling+0.031
land+0.029
 tale+0.029
leness+0.029
itive+0.028
Neg logit contributions
apped-0.015
amed-0.012
ay-0.011
enge-0.010
 of-0.010
TokenWhy
Token ID5195
Feature activation+7.30e-02

Pos logit contributions
eling+0.029
land+0.029
 tale+0.028
itive+0.028
 hesitation+0.026
Neg logit contributions
apped-0.069
enge-0.069
amed-0.068
ay-0.066
umbled-0.066
Token are
Token ID389
Feature activation-9.29e-03

Pos logit contributions
itive+0.096
eling+0.096
 why+0.095
 seem+0.093
land+0.091
Neg logit contributions
apped-0.122
amed-0.121
enge-0.117
umbled-0.117
ay-0.112
Token you
Token ID345
Feature activation+9.53e-03

Pos logit contributions
apped+0.015
enge+0.014
amed+0.014
ay+0.014
umbled+0.014
Neg logit contributions
eling-0.013
land-0.013
itive-0.013
 tale-0.013
 lav-0.012
Token sad
Token ID6507
Feature activation+2.27e-02

Pos logit contributions
eling+0.013
land+0.012
 tale+0.012
itive+0.012
 lav+0.012
Neg logit contributions
apped-0.016
amed-0.015
enge-0.015
ay-0.015
umbled-0.015
Token,
Token ID11
Feature activation+2.90e-03

Pos logit contributions
eling+0.023
land+0.021
 tale+0.021
itive+0.020
 lav+0.020
Neg logit contributions
apped-0.046
ay-0.044
enge-0.044
umbled-0.043
amed-0.043
Token Lily
Token ID20037
Feature activation+9.07e-03

Pos logit contributions
eling+0.004
land+0.004
itive+0.004
 tale+0.004
 hesitation+0.004
Neg logit contributions
apped-0.005
ay-0.005
enge-0.005
amed-0.004
umbled-0.004
Token asked
Token ID1965
Feature activation-4.36e-03

Pos logit contributions
eling+0.027
 tale+0.026
itive+0.025
land+0.025
 hesitation+0.024
Neg logit contributions
apped-0.036
enge-0.034
amed-0.034
ay-0.033
umbled-0.033
Token his
Token ID465
Feature activation-2.57e-04

Pos logit contributions
apped+0.007
enge+0.007
amed+0.007
ay+0.007
umbled+0.007
Neg logit contributions
eling-0.006
land-0.005
itive-0.005
 tale-0.005
 lav-0.005
Token mom
Token ID1995
Feature activation-5.53e-03

Pos logit contributions
apped+0.000
enge+0.000
umbled+0.000
amed+0.000
ay+0.000
Neg logit contributions
 why-0.000
land-0.000
 tale-0.000
eling-0.000
 gn-0.000
Token,
Token ID11
Feature activation+7.58e-03

Pos logit contributions
apped+0.009
enge+0.009
ay+0.009
amed+0.009
umbled+0.009
Neg logit contributions
eling-0.007
itive-0.007
 tale-0.007
land-0.007
 lav-0.007
Token "
Token ID366
Feature activation+2.47e-02

Pos logit contributions
eling+0.009
land+0.009
 tale+0.008
itive+0.008
 lav+0.008
Neg logit contributions
apped-0.014
enge-0.013
amed-0.013
ay-0.013
umbled-0.013
TokenWhy
Token ID5195
Feature activation+7.29e-02

Pos logit contributions
land+0.021
eling+0.020
 why+0.019
itive+0.019
 tale+0.019
Neg logit contributions
apped-0.054
ay-0.052
umbled-0.051
enge-0.051
amed-0.050
Token can
Token ID460
Feature activation+5.34e-04

Pos logit contributions
eling+0.101
land+0.098
itive+0.097
 why+0.097
 tale+0.096
Neg logit contributions
apped-0.120
amed-0.116
enge-0.114
umbled-0.113
ay-0.111
Token't
Token ID470
Feature activation-2.01e-02

Pos logit contributions
eling+0.001
 tale+0.001
 why+0.001
land+0.001
itive+0.001
Neg logit contributions
apped-0.001
amed-0.001
enge-0.001
umbled-0.001
ay-0.001
Token I
Token ID314
Feature activation+1.19e-02

Pos logit contributions
apped+0.035
amed+0.034
enge+0.033
umbled+0.032
ay+0.032
Neg logit contributions
eling-0.027
land-0.025
 tale-0.025
 why-0.025
itive-0.025
Token whistle
Token ID16121
Feature activation+1.05e-02

Pos logit contributions
 why+0.011
 seem+0.011
eling+0.010
 scr+0.010
 tale+0.010
Neg logit contributions
apped-0.026
amed-0.025
umbled-0.025
enge-0.025
ay-0.023
Token?"
Token ID1701
Feature activation+3.95e-03

Pos logit contributions
 why+0.011
 tale+0.011
land+0.011
eling+0.011
 scr+0.010
Neg logit contributions
apped-0.020
umbled-0.020
enge-0.020
amed-0.019
ay-0.019
Token's
Token ID338
Feature activation+2.10e-03

Pos logit contributions
eling+0.008
 tale+0.008
land+0.008
 why+0.008
 gn+0.008
Neg logit contributions
apped-0.011
ay-0.011
enge-0.011
umbled-0.010
amed-0.010
Token mom
Token ID1995
Feature activation+1.96e-03

Pos logit contributions
 why+0.003
 scr+0.002
 tale+0.002
 gn+0.002
 sweet+0.002
Neg logit contributions
umbled-0.004
amed-0.004
enge-0.004
apped-0.004
acked-0.003
Token said
Token ID531
Feature activation+1.13e-02

Pos logit contributions
eling+0.003
 tale+0.003
land+0.003
 why+0.003
itive+0.003
Neg logit contributions
apped-0.003
enge-0.003
ay-0.003
umbled-0.003
amed-0.003
Token,
Token ID11
Feature activation+1.92e-02

Pos logit contributions
eling+0.009
land+0.009
itive+0.009
 tale+0.008
 lav+0.008
Neg logit contributions
apped-0.025
enge-0.024
amed-0.024
ay-0.024
umbled-0.024
Token "
Token ID366
Feature activation+3.62e-02

Pos logit contributions
eling+0.037
land+0.032
 tale+0.032
itive+0.032
leness+0.031
Neg logit contributions
apped-0.026
ay-0.022
enge-0.021
amed-0.021
umbled-0.020
TokenWhy
Token ID5195
Feature activation+7.29e-02

Pos logit contributions
eling+0.045
land+0.043
itive+0.042
 tale+0.041
 lav+0.040
Neg logit contributions
apped-0.066
enge-0.062
ay-0.062
amed-0.061
umbled-0.060
Token don
Token ID836
Feature activation+5.58e-03

Pos logit contributions
eling+0.087
itive+0.086
 why+0.084
land+0.084
 tale+0.081
Neg logit contributions
apped-0.131
amed-0.129
enge-0.126
umbled-0.125
ay-0.121
Token't
Token ID470
Feature activation-1.62e-02

Pos logit contributions
eling+0.009
itive+0.009
 tale+0.009
land+0.009
 seem+0.009
Neg logit contributions
apped-0.007
enge-0.007
amed-0.007
umbled-0.007
ay-0.006
Token you
Token ID345
Feature activation+1.99e-03

Pos logit contributions
apped+0.023
enge+0.022
ay+0.021
umbled+0.021
amed+0.020
Neg logit contributions
eling-0.026
land-0.026
 tale-0.025
itive-0.025
 lav-0.024
Token cl
Token ID537
Feature activation-2.22e-02

Pos logit contributions
 scr+0.002
 seem+0.002
 why+0.002
eling+0.002
 tale+0.002
Neg logit contributions
apped-0.004
amed-0.004
umbled-0.004
enge-0.004
ase-0.004
Tokenap
Token ID499
Feature activation+2.92e-03

Pos logit contributions
amed+0.032
enge+0.031
umbled+0.030
apped+0.029
ied+0.029
Neg logit contributions
eling-0.030
 tale-0.030
 scr-0.029
 why-0.029
uf-0.027
Tokenmy
Token ID1820
Feature activation+2.18e-02

Pos logit contributions
apped+0.003
enge+0.003
ay+0.003
umbled+0.003
amed+0.003
Neg logit contributions
eling-0.003
 tale-0.003
land-0.003
itive-0.003
 why-0.003
Token asked
Token ID1965
Feature activation+2.16e-02

Pos logit contributions
eling+0.034
land+0.033
itive+0.032
 tale+0.032
 lav+0.031
Neg logit contributions
apped-0.032
enge-0.030
amed-0.030
ay-0.030
umbled-0.029
Token her
Token ID607
Feature activation+8.89e-03

Pos logit contributions
land+0.035
eling+0.035
itive+0.034
 tale+0.034
 gn+0.033
Neg logit contributions
apped-0.029
ay-0.027
enge-0.026
amed-0.025
ied-0.025
Token,
Token ID11
Feature activation+1.65e-02

Pos logit contributions
eling+0.013
land+0.011
itive+0.011
 lookout+0.011
leness+0.011
Neg logit contributions
apped-0.015
 of-0.013
ied-0.013
box-0.013
ay-0.013
Token "
Token ID366
Feature activation+3.32e-02

Pos logit contributions
eling+0.034
land+0.034
 tale+0.033
leness+0.032
itive+0.031
Neg logit contributions
apped-0.017
amed-0.014
ay-0.013
 of-0.013
enge-0.012
TokenWhy
Token ID5195
Feature activation+7.28e-02

Pos logit contributions
land+0.028
 tale+0.027
eling+0.026
itive+0.026
 hesitation+0.025
Neg logit contributions
apped-0.069
amed-0.069
enge-0.068
umbled-0.066
ay-0.065
Token are
Token ID389
Feature activation-9.05e-03

Pos logit contributions
eling+0.094
itive+0.094
 why+0.093
 seem+0.091
land+0.087
Neg logit contributions
apped-0.123
amed-0.122
enge-0.119
umbled-0.117
ay-0.114
Token you
Token ID345
Feature activation+9.53e-03

Pos logit contributions
apped+0.013
enge+0.013
ay+0.012
amed+0.012
umbled+0.012
Neg logit contributions
eling-0.014
land-0.014
itive-0.013
 tale-0.013
 lav-0.013
Token upset
Token ID9247
Feature activation+3.19e-03

Pos logit contributions
eling+0.013
land+0.013
 tale+0.012
itive+0.012
 lav+0.012
Neg logit contributions
apped-0.016
enge-0.015
amed-0.015
ay-0.015
umbled-0.015
Token,
Token ID11
Feature activation+3.65e-03

Pos logit contributions
eling+0.003
land+0.002
itive+0.002
 tale+0.002
 why+0.002
Neg logit contributions
apped-0.007
enge-0.007
amed-0.007
umbled-0.007
ay-0.007
Token Lily
Token ID20037
Feature activation+9.93e-03

Pos logit contributions
eling+0.005
land+0.004
itive+0.004
 tale+0.004
 lav+0.004
Neg logit contributions
apped-0.007
enge-0.006
amed-0.006
ay-0.006
umbled-0.006
Token.
Token ID13
Feature activation+1.28e-02

Pos logit contributions
eling+0.035
 gn+0.030
 tul+0.029
 tale+0.029
 lav+0.028
Neg logit contributions
apped-0.033
ay-0.027
umbled-0.026
 of-0.025
 owner-0.024
Token Her
Token ID2332
Feature activation+2.84e-02

Pos logit contributions
eling+0.019
itive+0.018
 tale+0.018
land+0.018
 hesitation+0.017
Neg logit contributions
apped-0.021
enge-0.019
umbled-0.019
ay-0.019
amed-0.019
Token mom
Token ID1995
Feature activation-4.87e-03

Pos logit contributions
 scr+0.037
 gn+0.037
 why+0.037
 hesitation+0.036
 wishing+0.036
Neg logit contributions
apped-0.048
enge-0.047
amed-0.047
umbled-0.044
ay-0.044
Token asked
Token ID1965
Feature activation+1.93e-02

Pos logit contributions
apped+0.008
enge+0.008
ay+0.007
amed+0.007
umbled+0.007
Neg logit contributions
eling-0.007
land-0.006
 tale-0.006
itive-0.006
 why-0.006
Token her
Token ID607
Feature activation+6.29e-03

Pos logit contributions
land+0.030
 tale+0.030
eling+0.030
itive+0.030
 gn+0.028
Neg logit contributions
apped-0.027
ay-0.025
enge-0.024
ied-0.023
amed-0.023
Token why
Token ID1521
Feature activation+7.28e-02

Pos logit contributions
eling+0.010
itive+0.009
land+0.009
 tale+0.008
 scr+0.008
Neg logit contributions
apped-0.010
ay-0.009
enge-0.009
amed-0.009
box-0.009
Token she
Token ID673
Feature activation+1.09e-03

Pos logit contributions
eling+0.072
 tale+0.071
land+0.071
itive+0.067
 tul+0.064
Neg logit contributions
apped-0.145
enge-0.138
ay-0.137
umbled-0.136
amed-0.133
Token was
Token ID373
Feature activation+1.28e-02

Pos logit contributions
eling+0.002
land+0.002
itive+0.002
 tale+0.002
 lav+0.002
Neg logit contributions
apped-0.002
enge-0.002
amed-0.001
ay-0.001
umbled-0.001
Token crying
Token ID13774
Feature activation+2.98e-02

Pos logit contributions
land+0.021
 tale+0.020
eling+0.020
 lav+0.019
 tul+0.019
Neg logit contributions
apped-0.017
amed-0.016
umbled-0.015
enge-0.015
box-0.015
Token and
Token ID290
Feature activation+1.64e-02

Pos logit contributions
eling+0.045
 tul+0.043
land+0.042
 gn+0.042
 tale+0.042
Neg logit contributions
apped-0.047
ay-0.044
umbled-0.042
enge-0.041
amed-0.041
Token Lily
Token ID20037
Feature activation+2.22e-02

Pos logit contributions
land+0.019
eling+0.019
itive+0.018
 tale+0.018
 lav+0.017
Neg logit contributions
apped-0.031
enge-0.029
amed-0.029
umbled-0.028
ay-0.028
TokenThe
Token ID464
Feature activation+2.23e-02

Pos logit contributions
eling+0.028
land+0.027
itive+0.026
 tale+0.026
 hesitation+0.025
Neg logit contributions
apped-0.040
enge-0.038
amed-0.038
ay-0.038
umbled-0.037
Token mother
Token ID2802
Feature activation-2.83e-03

Pos logit contributions
eling+0.033
itive+0.033
 tale+0.032
 scr+0.032
land+0.032
Neg logit contributions
apped-0.034
enge-0.032
umbled-0.032
amed-0.031
ay-0.031
Token asked
Token ID1965
Feature activation+8.72e-03

Pos logit contributions
apped+0.005
enge+0.005
ay+0.005
umbled+0.005
amed+0.005
Neg logit contributions
eling-0.004
 why-0.003
 tale-0.003
land-0.003
itive-0.003
Token:
Token ID25
Feature activation+2.40e-02

Pos logit contributions
eling+0.011
itive+0.010
 tale+0.010
land+0.010
 gn+0.009
Neg logit contributions
apped-0.016
ay-0.015
enge-0.015
amed-0.015
umbled-0.015
Token "
Token ID366
Feature activation+2.78e-02

Pos logit contributions
eling+0.052
 tale+0.048
itive+0.048
land+0.046
leness+0.044
Neg logit contributions
apped-0.030
ay-0.022
amed-0.022
umbled-0.021
enge-0.021
TokenWhy
Token ID5195
Feature activation+7.27e-02

Pos logit contributions
 tale+0.026
itive+0.025
land+0.025
eling+0.025
 hesitation+0.024
Neg logit contributions
apped-0.055
enge-0.055
amed-0.054
umbled-0.053
ase-0.051
Token are
Token ID389
Feature activation-5.37e-03

Pos logit contributions
eling+0.090
land+0.087
itive+0.086
 tale+0.085
 lav+0.082
Neg logit contributions
apped-0.129
enge-0.124
amed-0.122
ay-0.121
umbled-0.120
Token you
Token ID345
Feature activation+9.48e-03

Pos logit contributions
apped+0.008
enge+0.008
amed+0.007
ay+0.007
umbled+0.007
Neg logit contributions
eling-0.008
land-0.008
itive-0.008
 tale-0.008
 lav-0.008
Token sad
Token ID6507
Feature activation+2.39e-02

Pos logit contributions
eling+0.013
land+0.013
 tale+0.012
itive+0.012
 lav+0.012
Neg logit contributions
apped-0.016
amed-0.015
ay-0.015
umbled-0.015
enge-0.015
Token?"
Token ID1701
Feature activation+1.71e-02

Pos logit contributions
eling+0.028
land+0.026
 tale+0.026
itive+0.025
 gn+0.025
Neg logit contributions
apped-0.045
ay-0.043
enge-0.042
umbled-0.041
amed-0.041
Token\n
Token ID198
Feature activation+1.66e-02

Pos logit contributions
eling+0.033
itive+0.031
 tale+0.029
land+0.029
 hesitation+0.028
Neg logit contributions
apped-0.022
ay-0.021
enge-0.019
umbled-0.019
ase-0.018
Token.
Token ID13
Feature activation+1.68e-03

Pos logit contributions
apped+0.057
amed+0.057
enge+0.056
ay+0.056
umbled+0.055
Neg logit contributions
land-0.027
 why-0.026
 tale-0.025
itive-0.024
eling-0.023
Token They
Token ID1119
Feature activation+5.45e-03

Pos logit contributions
eling+0.003
itive+0.003
land+0.003
 tale+0.002
 hesitation+0.002
Neg logit contributions
apped-0.003
enge-0.002
ay-0.002
amed-0.002
umbled-0.002
Token hoped
Token ID10719
Feature activation-1.43e-02

Pos logit contributions
eling+0.007
 tale+0.007
 why+0.007
 gn+0.007
 scr+0.007
Neg logit contributions
apped-0.009
ay-0.009
umbled-0.009
enge-0.009
amed-0.009
Token the
Token ID262
Feature activation-2.15e-02

Pos logit contributions
amed+0.029
apped+0.029
enge+0.029
ay+0.028
umbled+0.028
Neg logit contributions
 why-0.015
land-0.015
eling-0.013
 what-0.013
 lav-0.013
Token k
Token ID479
Feature activation-3.72e-02

Pos logit contributions
apped+0.032
ay+0.031
enge+0.030
amed+0.030
umbled+0.030
Neg logit contributions
eling-0.032
itive-0.031
land-0.031
 tale-0.030
 gn-0.029
Tokenang
Token ID648
Feature activation-8.28e-02

Pos logit contributions
enge+0.063
amed+0.063
apped+0.063
box+0.060
ay+0.058
Neg logit contributions
eling-0.045
itive-0.041
 tale-0.041
land-0.041
 tul-0.040
Tokenaroo
Token ID38049
Feature activation-3.77e-02

Pos logit contributions
enge+0.110
apped+0.109
amed+0.108
ay+0.105
umbled+0.098
Neg logit contributions
itive-0.134
eling-0.133
land-0.128
 why-0.127
 tale-0.124
Token would
Token ID561
Feature activation-2.60e-02

Pos logit contributions
apped+0.059
enge+0.057
ay+0.056
amed+0.055
umbled+0.054
Neg logit contributions
eling-0.055
land-0.054
 tale-0.053
itive-0.052
 why-0.051
Token forgive
Token ID20927
Feature activation-2.55e-02

Pos logit contributions
apped+0.041
enge+0.039
ay+0.038
amed+0.038
umbled+0.037
Neg logit contributions
eling-0.036
itive-0.035
land-0.034
 tale-0.033
 lav-0.032
Token them
Token ID606
Feature activation-4.60e-03

Pos logit contributions
apped+0.042
enge+0.040
amed+0.040
ay+0.039
umbled+0.039
Neg logit contributions
eling-0.035
land-0.034
 tale-0.033
itive-0.033
 lav-0.032
Token and
Token ID290
Feature activation-7.30e-03

Pos logit contributions
apped+0.009
ay+0.008
enge+0.008
amed+0.008
umbled+0.008
Neg logit contributions
eling-0.005
itive-0.005
 lav-0.005
land-0.005
 tale-0.005
Token shake
Token ID13279
Feature activation+2.39e-02

Pos logit contributions
apped+0.009
umbled+0.009
ay+0.009
amed+0.009
enge+0.008
Neg logit contributions
eling-0.007
uf-0.006
 tale-0.006
itive-0.006
 lav-0.006
Token.
Token ID13
Feature activation-6.56e-03

Pos logit contributions
eling+0.030
 tale+0.028
itive+0.028
land+0.027
 lav+0.027
Neg logit contributions
apped-0.042
ay-0.041
enge-0.040
umbled-0.039
amed-0.039
Token Everywhere
Token ID48201
Feature activation+7.05e-03

Pos logit contributions
apped+0.011
enge+0.011
amed+0.011
ay+0.011
umbled+0.010
Neg logit contributions
land-0.008
eling-0.008
 gn-0.008
 tale-0.008
 why-0.008
Token the
Token ID262
Feature activation-1.44e-02

Pos logit contributions
eling+0.005
 why+0.005
land+0.004
 tale+0.004
itive+0.004
Neg logit contributions
apped-0.016
amed-0.016
enge-0.016
umbled-0.016
ay-0.016
Token k
Token ID479
Feature activation-3.58e-02

Pos logit contributions
apped+0.025
enge+0.024
amed+0.023
umbled+0.023
ay+0.023
Neg logit contributions
eling-0.019
land-0.018
 tale-0.018
itive-0.018
 gn-0.018
Tokenang
Token ID648
Feature activation-8.24e-02

Pos logit contributions
amed+0.046
enge+0.046
ay+0.044
box+0.043
apped+0.042
Neg logit contributions
eling-0.054
Far-0.052
 tale-0.050
itive-0.050
 gn-0.049
Tokenaroo
Token ID38049
Feature activation-3.30e-02

Pos logit contributions
enge+0.111
amed+0.108
ay+0.108
apped+0.104
box+0.099
Neg logit contributions
eling-0.136
itive-0.131
land-0.124
 why-0.123
 tale-0.118
Token went
Token ID1816
Feature activation-6.18e-03

Pos logit contributions
apped+0.055
enge+0.055
ay+0.055
amed+0.053
umbled+0.052
Neg logit contributions
eling-0.044
land-0.041
 tale-0.041
itive-0.040
 gn-0.040
Token,
Token ID11
Feature activation-1.44e-03

Pos logit contributions
apped+0.014
enge+0.013
amed+0.013
ay+0.013
umbled+0.013
Neg logit contributions
eling-0.005
land-0.005
 tale-0.005
itive-0.005
 why-0.005
Token change
Token ID1487
Feature activation+3.16e-03

Pos logit contributions
apped+0.003
enge+0.003
ay+0.003
amed+0.003
umbled+0.003
Neg logit contributions
eling-0.001
land-0.001
itive-0.001
 why-0.001
 tale-0.001
Token happened
Token ID3022
Feature activation+1.77e-02

Pos logit contributions
eling+0.004
land+0.004
 tale+0.004
itive+0.004
 lav+0.004
Neg logit contributions
apped-0.005
enge-0.005
ay-0.005
amed-0.005
umbled-0.005
Token little
Token ID1310
Feature activation+6.96e-03

Pos logit contributions
apped+0.020
ay+0.020
enge+0.019
umbled+0.018
amed+0.018
Neg logit contributions
eling-0.018
itive-0.017
land-0.017
 tale-0.017
 lav-0.016
Token boys
Token ID6510
Feature activation-1.15e-02

Pos logit contributions
eling+0.011
 scr+0.011
land+0.011
zing+0.011
 lav+0.010
Neg logit contributions
apped-0.009
ay-0.009
ite-0.009
enge-0.009
umbled-0.008
Token,"
Token ID553
Feature activation+1.37e-03

Pos logit contributions
apped+0.022
enge+0.021
ay+0.021
amed+0.020
umbled+0.020
Neg logit contributions
eling-0.013
land-0.012
itive-0.012
 lav-0.012
 tale-0.011
Token the
Token ID262
Feature activation-4.52e-03

Pos logit contributions
land+0.002
eling+0.002
itive+0.002
 tale+0.002
zing+0.002
Neg logit contributions
apped-0.002
enge-0.002
ay-0.002
umbled-0.002
amed-0.002
Token k
Token ID479
Feature activation-3.44e-02

Pos logit contributions
apped+0.007
enge+0.007
ay+0.007
amed+0.007
umbled+0.007
Neg logit contributions
eling-0.006
land-0.006
itive-0.006
 tale-0.006
 lav-0.006
Tokenang
Token ID648
Feature activation-8.16e-02

Pos logit contributions
amed+0.055
apped+0.055
enge+0.054
umbled+0.051
ay+0.051
Neg logit contributions
eling-0.045
 tale-0.042
 tul-0.041
land-0.041
itive-0.041
Tokenaroo
Token ID38049
Feature activation-3.02e-02

Pos logit contributions
apped+0.096
enge+0.093
amed+0.092
ay+0.091
umbled+0.087
Neg logit contributions
eling-0.147
itive-0.142
land-0.140
 why-0.138
 tale-0.138
Token says
Token ID1139
Feature activation-2.79e-02

Pos logit contributions
apped+0.037
enge+0.035
ay+0.033
amed+0.032
umbled+0.031
Neg logit contributions
itive-0.054
eling-0.054
land-0.052
 tale-0.051
 hesitation-0.050
Token.
Token ID13
Feature activation-1.31e-02

Pos logit contributions
apped+0.059
amed+0.053
ay+0.053
enge+0.053
umbled+0.053
Neg logit contributions
eling-0.029
 tale-0.026
itive-0.026
land-0.026
 lav-0.025
Token "
Token ID366
Feature activation-5.13e-03

Pos logit contributions
apped+0.021
enge+0.019
umbled+0.019
ay+0.019
amed+0.018
Neg logit contributions
eling-0.021
 tale-0.019
itive-0.019
land-0.019
 tul-0.017
TokenI
Token ID40
Feature activation+7.47e-03

Pos logit contributions
apped+0.008
enge+0.008
amed+0.008
umbled+0.007
ay+0.007
Neg logit contributions
eling-0.007
land-0.007
 tale-0.007
itive-0.007
 lav-0.007
Token be
Token ID307
Feature activation+1.39e-02

Pos logit contributions
eling+0.035
itive+0.035
land+0.033
cel+0.033
 cuc+0.033
Neg logit contributions
apped-0.028
enge-0.028
amed-0.027
ay-0.027
 losers-0.026
Token careful
Token ID8161
Feature activation-8.09e-03

Pos logit contributions
eling+0.017
land+0.017
 tale+0.016
itive+0.016
 why+0.016
Neg logit contributions
apped-0.025
enge-0.024
ay-0.024
umbled-0.023
amed-0.023
Token.
Token ID13
Feature activation-3.97e-03

Pos logit contributions
apped+0.016
ay+0.015
amed+0.015
enge+0.015
umbled+0.015
Neg logit contributions
eling-0.009
land-0.008
 tale-0.008
itive-0.008
 lav-0.008
Token The
Token ID383
Feature activation-1.67e-03

Pos logit contributions
apped+0.006
enge+0.006
amed+0.006
ay+0.006
umbled+0.005
Neg logit contributions
eling-0.006
land-0.006
 tale-0.006
itive-0.006
 lav-0.006
Token k
Token ID479
Feature activation-3.21e-02

Pos logit contributions
apped+0.003
enge+0.003
amed+0.003
umbled+0.003
ay+0.003
Neg logit contributions
eling-0.002
land-0.002
 tale-0.002
itive-0.002
 lav-0.002
Tokenang
Token ID648
Feature activation-8.14e-02

Pos logit contributions
amed+0.058
enge+0.056
apped+0.056
box+0.053
ied+0.053
Neg logit contributions
eling-0.036
 tale-0.033
itive-0.031
land-0.031
 tul-0.031
Tokenaroo
Token ID38049
Feature activation-2.14e-02

Pos logit contributions
apped+0.106
amed+0.105
enge+0.104
ay+0.100
umbled+0.097
Neg logit contributions
eling-0.135
itive-0.132
 why-0.128
land-0.126
 tale-0.126
Token is
Token ID318
Feature activation+3.88e-03

Pos logit contributions
apped+0.030
enge+0.029
amed+0.028
ay+0.028
umbled+0.028
Neg logit contributions
eling-0.035
 tale-0.033
land-0.033
itive-0.032
 why-0.032
Token not
Token ID407
Feature activation+2.48e-03

Pos logit contributions
eling+0.005
land+0.005
 tale+0.004
itive+0.004
 lav+0.004
Neg logit contributions
apped-0.007
enge-0.007
ay-0.007
amed-0.007
umbled-0.007
Token a
Token ID257
Feature activation-1.19e-03

Pos logit contributions
eling+0.003
land+0.003
itive+0.003
 tale+0.003
 gn+0.003
Neg logit contributions
apped-0.004
enge-0.004
amed-0.004
ay-0.004
umbled-0.004
Token toy
Token ID13373
Feature activation-3.22e-02

Pos logit contributions
apped+0.002
enge+0.002
ay+0.002
amed+0.002
umbled+0.002
Neg logit contributions
eling-0.002
land-0.002
itive-0.002
 tale-0.001
 lav-0.001
Tokenang
Token ID648
Feature activation-5.95e-02

Pos logit contributions
apped+0.033
amed+0.032
enge+0.031
umbled+0.029
box+0.028
Neg logit contributions
eling-0.041
 tale-0.041
itive-0.039
 tul-0.039
land-0.038
Tokenaroo
Token ID38049
Feature activation-6.84e-03

Pos logit contributions
apped+0.075
amed+0.073
enge+0.073
umbled+0.069
ay+0.069
Neg logit contributions
eling-0.103
itive-0.102
 tale-0.096
 why-0.095
land-0.095
Token!"
Token ID2474
Feature activation-1.60e-02

Pos logit contributions
apped+0.014
enge+0.013
amed+0.013
umbled+0.013
ay+0.013
Neg logit contributions
eling-0.007
land-0.007
 tale-0.007
itive-0.007
 why-0.006
Token The
Token ID383
Feature activation+5.97e-03

Pos logit contributions
apped+0.026
umbled+0.022
ay+0.022
amed+0.022
enge+0.021
Neg logit contributions
eling-0.027
itive-0.025
 hesitation-0.024
 tale-0.024
 gn-0.023
Token k
Token ID479
Feature activation-3.02e-02

Pos logit contributions
eling+0.008
land+0.008
itive+0.008
 tale+0.008
 lav+0.007
Neg logit contributions
apped-0.010
enge-0.010
amed-0.009
ay-0.009
umbled-0.009
Tokenang
Token ID648
Feature activation-8.05e-02

Pos logit contributions
apped+0.050
enge+0.049
amed+0.049
box+0.048
umbled+0.047
Neg logit contributions
eling-0.036
 tul-0.035
 tale-0.035
itive-0.033
 lav-0.032
Tokenaroo
Token ID38049
Feature activation-2.52e-02

Pos logit contributions
enge+0.105
apped+0.104
amed+0.102
ay+0.102
box+0.097
Neg logit contributions
eling-0.130
itive-0.128
 why-0.122
land-0.120
 tale-0.118
Token sque
Token ID35776
Feature activation-2.09e-02

Pos logit contributions
apped+0.043
enge+0.041
amed+0.041
ay+0.040
umbled+0.040
Neg logit contributions
eling-0.033
land-0.031
itive-0.031
 tale-0.031
 lav-0.030
Tokenaked
Token ID4335
Feature activation+1.52e-02

Pos logit contributions
enge+0.015
ay+0.015
apped+0.013
case+0.013
ane+0.013
Neg logit contributions
eling-0.045
 tale-0.043
itive-0.042
land-0.041
 lav-0.041
Token in
Token ID287
Feature activation+7.44e-03

Pos logit contributions
eling+0.020
land+0.020
itive+0.020
 tale+0.020
 lav+0.019
Neg logit contributions
apped-0.025
enge-0.024
ay-0.023
amed-0.023
umbled-0.023
Token response
Token ID2882
Feature activation+3.33e-03

Pos logit contributions
eling+0.012
land+0.011
itive+0.011
itionally+0.011
 tale+0.011
Neg logit contributions
ay-0.011
apped-0.011
amed-0.010
enge-0.009
umbled-0.009
TokenHello
Token ID15496
Feature activation+2.41e-02

Pos logit contributions
eling+0.008
itive+0.008
 tale+0.008
land+0.008
 why+0.008
Neg logit contributions
apped-0.007
ay-0.007
enge-0.007
amed-0.007
umbled-0.007
Token friend
Token ID1545
Feature activation-3.02e-02

Pos logit contributions
eling+0.035
leness+0.032
itive+0.032
 lookout+0.032
 lav+0.031
Neg logit contributions
ay-0.040
apped-0.038
enge-0.036
ite-0.036
ane-0.035
Token!"
Token ID2474
Feature activation-1.36e-02

Pos logit contributions
apped+0.058
enge+0.055
ay+0.055
amed+0.054
umbled+0.054
Neg logit contributions
eling-0.034
land-0.031
itive-0.031
 tale-0.031
 lav-0.030
Token The
Token ID383
Feature activation+5.30e-03

Pos logit contributions
apped+0.023
umbled+0.020
ay+0.020
enge+0.020
amed+0.020
Neg logit contributions
eling-0.021
itive-0.020
 tale-0.019
land-0.019
 hesitation-0.018
Token k
Token ID479
Feature activation-2.90e-02

Pos logit contributions
eling+0.007
land+0.006
 tale+0.006
itive+0.006
 tul+0.006
Neg logit contributions
apped-0.009
enge-0.009
amed-0.009
umbled-0.009
ay-0.009
Tokenang
Token ID648
Feature activation-7.90e-02

Pos logit contributions
apped+0.049
enge+0.049
amed+0.048
box+0.046
ied+0.046
Neg logit contributions
eling-0.033
 tul-0.032
 tale-0.032
itive-0.030
land-0.030
Tokenaroo
Token ID38049
Feature activation-2.37e-02

Pos logit contributions
enge+0.103
apped+0.102
amed+0.099
ay+0.099
umbled+0.093
Neg logit contributions
eling-0.128
itive-0.126
 why-0.121
land-0.121
 tale-0.118
Token sque
Token ID35776
Feature activation-2.02e-02

Pos logit contributions
apped+0.039
enge+0.037
amed+0.036
ay+0.036
umbled+0.035
Neg logit contributions
eling-0.033
land-0.032
itive-0.031
 tale-0.031
 lav-0.030
Tokenaked
Token ID4335
Feature activation+1.74e-02

Pos logit contributions
enge+0.015
ay+0.014
apped+0.013
case+0.013
ane+0.013
Neg logit contributions
eling-0.044
itive-0.042
 tale-0.041
land-0.041
 hesitation-0.040
Token happily
Token ID18177
Feature activation-1.13e-02

Pos logit contributions
eling+0.024
land+0.023
itive+0.023
 tale+0.023
 lav+0.022
Neg logit contributions
apped-0.028
enge-0.027
amed-0.026
ay-0.026
umbled-0.026
Token in
Token ID287
Feature activation+1.36e-02

Pos logit contributions
apped+0.020
amed+0.019
ay+0.018
umbled+0.018
enge+0.018
Neg logit contributions
eling-0.015
itive-0.014
 lav-0.013
land-0.013
 tul-0.013
Token he
Token ID339
Feature activation-1.27e-02

Pos logit contributions
eling+0.018
land+0.017
 tale+0.017
itive+0.017
 tul+0.017
Neg logit contributions
apped-0.023
ay-0.021
enge-0.021
umbled-0.020
amed-0.020
Token could
Token ID714
Feature activation-6.31e-03

Pos logit contributions
apped+0.023
enge+0.022
ay+0.021
umbled+0.021
amed+0.020
Neg logit contributions
eling-0.017
 why-0.016
 tale-0.016
 seem-0.015
itive-0.015
Token do
Token ID466
Feature activation-3.82e-03

Pos logit contributions
apped+0.011
enge+0.010
ay+0.010
amed+0.010
umbled+0.010
Neg logit contributions
eling-0.008
land-0.008
itive-0.008
 tale-0.008
 lav-0.008
Token anything
Token ID1997
Feature activation+2.26e-02

Pos logit contributions
apped+0.006
amed+0.005
ay+0.005
enge+0.005
umbled+0.005
Neg logit contributions
 tul-0.005
itive-0.005
 tale-0.005
 lav-0.005
land-0.005
Token if
Token ID611
Feature activation+2.85e-03

Pos logit contributions
 tul+0.031
eling+0.031
itive+0.031
 lav+0.030
uf+0.030
Neg logit contributions
apped-0.032
enge-0.031
ay-0.031
amed-0.030
umbled-0.029
Tokenhe
Token ID258
Feature activation-7.85e-02

Pos logit contributions
eling+0.003
land+0.003
itive+0.003
 tale+0.003
 why+0.003
Neg logit contributions
apped-0.006
enge-0.006
amed-0.006
ay-0.005
umbled-0.005
Token set
Token ID900
Feature activation-2.44e-02

Pos logit contributions
apped+0.127
enge+0.120
ay+0.117
amed+0.116
umbled+0.116
Neg logit contributions
eling-0.109
land-0.107
 tale-0.105
itive-0.103
 lav-0.101
Token his
Token ID465
Feature activation-1.91e-02

Pos logit contributions
apped+0.037
ay+0.036
enge+0.035
amed+0.034
umbled+0.034
Neg logit contributions
eling-0.034
itive-0.034
 tale-0.034
land-0.034
 lav-0.033
Token mind
Token ID2000
Feature activation-2.26e-02

Pos logit contributions
apped+0.030
enge+0.029
amed+0.028
ay+0.028
umbled+0.028
Neg logit contributions
eling-0.027
land-0.027
 tale-0.026
itive-0.026
 lav-0.025
Token to
Token ID284
Feature activation+8.91e-03

Pos logit contributions
apped+0.038
enge+0.038
amed+0.036
ay+0.036
umbled+0.035
Neg logit contributions
eling-0.029
land-0.029
itive-0.028
 why-0.028
 tale-0.028
Token it
Token ID340
Feature activation-1.03e-02

Pos logit contributions
land+0.012
itive+0.012
eling+0.012
 tale+0.012
 lav+0.011
Neg logit contributions
apped-0.013
ay-0.013
enge-0.013
amed-0.012
umbled-0.012
Token says
Token ID1139
Feature activation-3.05e-02

Pos logit contributions
apped+0.028
enge+0.027
amed+0.027
ay+0.027
umbled+0.026
Neg logit contributions
eling-0.023
land-0.023
 tale-0.023
itive-0.022
 gn-0.022
Token.
Token ID13
Feature activation-1.67e-02

Pos logit contributions
apped+0.068
enge+0.066
ay+0.065
amed+0.065
umbled+0.064
Neg logit contributions
eling-0.024
land-0.023
itive-0.022
 tale-0.022
 lav-0.021
Token "
Token ID366
Feature activation-5.22e-03

Pos logit contributions
apped+0.030
enge+0.028
ay+0.027
amed+0.027
umbled+0.027
Neg logit contributions
eling-0.022
land-0.021
itive-0.020
 tale-0.020
 lav-0.019
TokenThe
Token ID464
Feature activation+2.50e-03

Pos logit contributions
apped+0.007
enge+0.007
amed+0.007
umbled+0.007
ay+0.007
Neg logit contributions
 tale-0.008
eling-0.008
land-0.008
 hesitation-0.008
itive-0.008
Token k
Token ID479
Feature activation-3.13e-02

Pos logit contributions
eling+0.003
land+0.003
 tale+0.003
itive+0.003
 lav+0.003
Neg logit contributions
apped-0.004
enge-0.004
amed-0.004
ay-0.004
umbled-0.004
Tokenang
Token ID648
Feature activation-7.85e-02

Pos logit contributions
amed+0.055
apped+0.054
enge+0.054
ied+0.052
box+0.051
Neg logit contributions
eling-0.035
 tale-0.034
 tul-0.033
land-0.032
itive-0.031
Tokenaroo
Token ID38049
Feature activation-1.96e-02

Pos logit contributions
apped+0.106
amed+0.105
enge+0.103
ay+0.099
umbled+0.097
Neg logit contributions
eling-0.127
itive-0.124
 why-0.120
 tale-0.118
land-0.118
Token is
Token ID318
Feature activation+5.59e-03

Pos logit contributions
apped+0.028
enge+0.026
amed+0.026
ay+0.025
umbled+0.025
Neg logit contributions
eling-0.032
land-0.031
 tale-0.030
itive-0.030
 why-0.029
Token not
Token ID407
Feature activation+7.70e-03

Pos logit contributions
eling+0.006
 tale+0.006
land+0.006
 why+0.006
 gn+0.006
Neg logit contributions
apped-0.011
enge-0.010
umbled-0.010
ay-0.010
amed-0.010
Token from
Token ID422
Feature activation-2.45e-02

Pos logit contributions
eling+0.010
land+0.009
itive+0.009
 tale+0.009
 lav+0.009
Neg logit contributions
apped-0.013
enge-0.013
amed-0.012
ay-0.012
umbled-0.012
Token the
Token ID262
Feature activation+1.28e-02

Pos logit contributions
apped+0.039
enge+0.037
amed+0.037
umbled+0.036
ay+0.036
Neg logit contributions
eling-0.036
land-0.035
 tale-0.034
itive-0.034
 why-0.033
Token,
Token ID11
Feature activation+5.68e-03

Pos logit contributions
apped+0.004
enge+0.004
ay+0.004
amed+0.004
umbled+0.004
Neg logit contributions
eling-0.002
land-0.002
 tale-0.002
itive-0.002
 lav-0.002
Token "
Token ID366
Feature activation+1.23e-02

Pos logit contributions
 tale+0.010
eling+0.010
land+0.010
itive+0.009
leness+0.009
Neg logit contributions
apped-0.008
amed-0.007
enge-0.006
ay-0.006
box-0.006
TokenWhat
Token ID2061
Feature activation+4.71e-02

Pos logit contributions
 tale+0.016
eling+0.015
itive+0.015
land+0.015
 hesitation+0.014
Neg logit contributions
enge-0.020
amed-0.020
apped-0.019
ay-0.019
umbled-0.018
Token does
Token ID857
Feature activation-5.42e-03

Pos logit contributions
land+0.062
eling+0.062
 tale+0.061
 why+0.061
itive+0.061
Neg logit contributions
apped-0.079
amed-0.078
umbled-0.076
enge-0.075
ay-0.073
Token '
Token ID705
Feature activation-1.67e-03

Pos logit contributions
ay+0.007
apped+0.007
enge+0.007
 losers+0.006
ane+0.006
Neg logit contributions
eling-0.009
land-0.009
itive-0.008
 symp-0.008
 tul-0.008
Tokenang
Token ID648
Feature activation-7.82e-02

Pos logit contributions
apped+0.003
enge+0.003
ay+0.003
amed+0.002
umbled+0.002
Neg logit contributions
eling-0.002
land-0.002
itive-0.002
 tale-0.002
 lav-0.002
Tokenry
Token ID563
Feature activation-1.42e-02

Pos logit contributions
apped+0.116
enge+0.112
amed+0.111
ay+0.108
umbled+0.107
Neg logit contributions
eling-0.120
itive-0.113
 why-0.111
 tale-0.110
land-0.109
Token'
Token ID6
Feature activation-4.02e-03

Pos logit contributions
apped+0.025
enge+0.024
amed+0.023
ay+0.023
umbled+0.023
Neg logit contributions
eling-0.018
land-0.017
itive-0.017
 tale-0.017
 lav-0.016
Token mean
Token ID1612
Feature activation-8.84e-03

Pos logit contributions
apped+0.007
enge+0.007
amed+0.007
ay+0.007
umbled+0.007
Neg logit contributions
eling-0.005
land-0.005
 tale-0.005
itive-0.005
 why-0.005
Token?".
Token ID43634
Feature activation+2.63e-03

Pos logit contributions
apped+0.017
enge+0.016
amed+0.016
umbled+0.015
ay+0.015
Neg logit contributions
eling-0.010
land-0.010
itive-0.010
 tale-0.010
 why-0.009
Token\n
Token ID198
Feature activation+4.77e-03

Pos logit contributions
eling+0.004
land+0.004
itive+0.004
 tale+0.004
 lav+0.003
Neg logit contributions
apped-0.004
enge-0.004
amed-0.004
ay-0.004
umbled-0.004
Token a
Token ID257
Feature activation-1.36e-02

Pos logit contributions
apped+0.002
ay+0.002
enge+0.002
amed+0.002
umbled+0.002
Neg logit contributions
eling-0.001
itive-0.001
 lav-0.001
 tale-0.001
land-0.001
Token story
Token ID1621
Feature activation+3.31e-02

Pos logit contributions
ay+0.021
apped+0.020
enge+0.020
ane+0.019
amed+0.018
Neg logit contributions
eling-0.021
itive-0.020
land-0.017
 hesitation-0.017
 scr-0.017
Token,"
Token ID553
Feature activation-1.76e-03

Pos logit contributions
land+0.030
 tale+0.028
 why+0.026
itive+0.025
eling+0.024
Neg logit contributions
apped-0.071
umbled-0.069
enge-0.069
amed-0.069
ay-0.065
Token the
Token ID262
Feature activation-5.32e-03

Pos logit contributions
apped+0.003
amed+0.003
umbled+0.003
enge+0.003
ay+0.003
Neg logit contributions
 gn-0.002
 why-0.002
eling-0.002
 tale-0.002
 lav-0.002
Token k
Token ID479
Feature activation-3.30e-02

Pos logit contributions
apped+0.009
enge+0.009
amed+0.009
ay+0.008
umbled+0.008
Neg logit contributions
eling-0.007
land-0.007
 tale-0.007
itive-0.007
 lav-0.006
Tokenang
Token ID648
Feature activation-7.80e-02

Pos logit contributions
amed+0.056
apped+0.055
enge+0.053
umbled+0.052
ay+0.051
Neg logit contributions
eling-0.040
 tale-0.039
 tul-0.038
land-0.037
itive-0.037
Tokenaroo
Token ID38049
Feature activation-2.63e-02

Pos logit contributions
apped+0.093
enge+0.091
amed+0.090
ay+0.089
umbled+0.085
Neg logit contributions
eling-0.138
itive-0.135
land-0.132
 why-0.130
 tale-0.129
Token says
Token ID1139
Feature activation-2.71e-02

Pos logit contributions
apped+0.035
enge+0.033
ay+0.032
amed+0.031
umbled+0.031
Neg logit contributions
eling-0.044
itive-0.043
land-0.043
 tale-0.042
 lav-0.041
Token.
Token ID13
Feature activation-2.07e-02

Pos logit contributions
apped+0.058
amed+0.053
enge+0.053
umbled+0.053
ay+0.053
Neg logit contributions
eling-0.027
land-0.024
itive-0.024
 tale-0.024
 lav-0.023
Token "
Token ID366
Feature activation-1.58e-03

Pos logit contributions
apped+0.036
enge+0.034
ay+0.034
amed+0.033
umbled+0.033
Neg logit contributions
eling-0.028
land-0.026
itive-0.026
 tale-0.026
 lav-0.024
TokenCome
Token ID16773
Feature activation+1.94e-02

Pos logit contributions
apped+0.003
enge+0.003
amed+0.002
ay+0.002
umbled+0.002
Neg logit contributions
eling-0.002
itive-0.002
land-0.002
 tale-0.002
 hesitation-0.002
Tokenang
Token ID648
Feature activation-6.92e-02

Pos logit contributions
apped+0.042
amed+0.041
enge+0.040
umbled+0.039
ay+0.037
Neg logit contributions
eling-0.048
 tale-0.048
itive-0.047
land-0.047
 tul-0.046
Tokenaroo
Token ID38049
Feature activation-1.60e-02

Pos logit contributions
apped+0.076
enge+0.071
amed+0.070
ay+0.068
umbled+0.068
Neg logit contributions
eling-0.132
itive-0.128
land-0.128
 tale-0.127
 why-0.124
Token!".
Token ID48220
Feature activation-7.62e-03

Pos logit contributions
apped+0.030
enge+0.029
ay+0.029
amed+0.028
umbled+0.028
Neg logit contributions
eling-0.018
land-0.017
itive-0.016
 hesitation-0.016
 lav-0.016
Token The
Token ID383
Feature activation+9.84e-03

Pos logit contributions
apped+0.011
ay+0.010
umbled+0.010
amed+0.009
enge+0.009
Neg logit contributions
eling-0.014
itive-0.013
 tale-0.012
 hesitation-0.012
 scr-0.011
Token k
Token ID479
Feature activation-2.90e-02

Pos logit contributions
eling+0.014
land+0.013
itive+0.013
 tale+0.013
 lav+0.012
Neg logit contributions
apped-0.016
enge-0.016
ay-0.015
amed-0.015
umbled-0.015
Tokenang
Token ID648
Feature activation-7.79e-02

Pos logit contributions
apped+0.047
amed+0.047
enge+0.046
box+0.046
umbled+0.045
Neg logit contributions
eling-0.034
 tale-0.033
 tul-0.033
itive-0.031
 lav-0.031
Tokenaroo
Token ID38049
Feature activation-2.32e-02

Pos logit contributions
enge+0.102
apped+0.101
amed+0.100
ay+0.100
umbled+0.094
Neg logit contributions
eling-0.124
itive-0.123
 why-0.119
land-0.115
 tale-0.112
Token stopped
Token ID5025
Feature activation-5.34e-03

Pos logit contributions
apped+0.038
enge+0.037
ay+0.036
amed+0.036
umbled+0.035
Neg logit contributions
eling-0.031
land-0.030
 tale-0.030
itive-0.030
 lav-0.029
Token jumping
Token ID14284
Feature activation+1.62e-02

Pos logit contributions
apped+0.008
umbled+0.007
amed+0.007
enge+0.007
ay+0.006
Neg logit contributions
land-0.009
eling-0.009
 tale-0.008
itive-0.008
mmm-0.008
Token and
Token ID290
Feature activation+1.91e-03

Pos logit contributions
land+0.018
eling+0.017
itive+0.017
 tale+0.017
 gn+0.017
Neg logit contributions
apped-0.030
enge-0.029
umbled-0.028
amed-0.028
ay-0.028
Token hopped
Token ID45230
Feature activation+2.08e-02

Pos logit contributions
land+0.003
eling+0.003
 tale+0.003
itive+0.003
 lav+0.003
Neg logit contributions
apped-0.003
enge-0.003
amed-0.003
umbled-0.002
ay-0.002
Tokenang
Token ID648
Feature activation-5.61e-02

Pos logit contributions
amed+0.029
enge+0.029
apped+0.029
umbled+0.027
box+0.026
Neg logit contributions
eling-0.027
 tul-0.026
land-0.026
 tale-0.026
 lav-0.025
Tokenaroo
Token ID38049
Feature activation-3.02e-03

Pos logit contributions
apped+0.066
enge+0.065
amed+0.064
ay+0.062
umbled+0.060
Neg logit contributions
eling-0.099
itive-0.098
land-0.098
 tale-0.095
 why-0.095
Token.
Token ID13
Feature activation+4.45e-04

Pos logit contributions
apped+0.006
enge+0.006
ay+0.006
amed+0.006
umbled+0.005
Neg logit contributions
eling-0.003
itive-0.003
land-0.003
 tale-0.003
 hesitation-0.003
Token The
Token ID383
Feature activation+7.87e-03

Pos logit contributions
eling+0.001
itive+0.001
 tale+0.001
land+0.001
 hesitation+0.001
Neg logit contributions
apped-0.001
ay-0.001
enge-0.001
amed-0.001
umbled-0.001
Token k
Token ID479
Feature activation-3.17e-02

Pos logit contributions
eling+0.012
land+0.011
itive+0.011
 tale+0.011
 why+0.011
Neg logit contributions
apped-0.012
enge-0.011
ay-0.011
amed-0.011
umbled-0.011
Tokenang
Token ID648
Feature activation-7.78e-02

Pos logit contributions
amed+0.052
enge+0.052
apped+0.052
box+0.049
umbled+0.048
Neg logit contributions
eling-0.040
 tul-0.039
 tale-0.038
land-0.038
 lav-0.036
Tokenaroo
Token ID38049
Feature activation-1.89e-02

Pos logit contributions
apped+0.097
enge+0.095
amed+0.094
ay+0.091
umbled+0.088
Neg logit contributions
eling-0.135
itive-0.129
land-0.129
 tale-0.127
 why-0.126
Token eats
Token ID25365
Feature activation+9.48e-03

Pos logit contributions
apped+0.030
enge+0.028
ay+0.027
amed+0.027
umbled+0.027
Neg logit contributions
itive-0.026
eling-0.026
land-0.025
 tale-0.024
 hesitation-0.024
Token them
Token ID606
Feature activation+1.38e-02

Pos logit contributions
eling+0.014
itive+0.014
 tale+0.013
land+0.013
 gn+0.012
Neg logit contributions
apped-0.015
enge-0.014
amed-0.013
ay-0.013
umbled-0.013
Token.
Token ID13
Feature activation-1.28e-03

Pos logit contributions
eling+0.018
itive+0.018
 tale+0.017
land+0.017
 hesitation+0.016
Neg logit contributions
apped-0.024
amed-0.021
enge-0.021
ay-0.021
umbled-0.021
Token It
Token ID632
Feature activation+2.52e-03

Pos logit contributions
apped+0.002
ay+0.001
enge+0.001
amed+0.001
umbled+0.001
Neg logit contributions
eling-0.002
itive-0.002
 tale-0.002
land-0.002
 hesitation-0.002
Token They
Token ID1119
Feature activation+1.21e-02

Pos logit contributions
eling+0.005
land+0.004
itive+0.004
 tale+0.004
 lav+0.004
Neg logit contributions
apped-0.004
enge-0.004
ay-0.004
amed-0.004
umbled-0.004
Token hoped
Token ID10719
Feature activation-9.15e-03

Pos logit contributions
eling+0.015
 scr+0.014
itive+0.014
 why+0.014
 tale+0.014
Neg logit contributions
apped-0.022
ay-0.021
enge-0.021
umbled-0.021
amed-0.020
Token the
Token ID262
Feature activation-1.82e-02

Pos logit contributions
apped+0.017
enge+0.017
amed+0.017
ay+0.017
umbled+0.016
Neg logit contributions
eling-0.010
land-0.010
 why-0.010
 tale-0.010
itive-0.010
Token baby
Token ID5156
Feature activation-3.94e-02

Pos logit contributions
apped+0.029
enge+0.027
ay+0.027
amed+0.027
umbled+0.026
Neg logit contributions
eling-0.027
land-0.026
itive-0.025
 tale-0.025
 lav-0.024
Token k
Token ID479
Feature activation-3.22e-02

Pos logit contributions
apped+0.069
enge+0.066
amed+0.066
ay+0.065
umbled+0.065
Neg logit contributions
eling-0.050
land-0.049
 tale-0.049
itive-0.048
 why-0.047
Tokenang
Token ID648
Feature activation-7.78e-02

Pos logit contributions
amed+0.057
enge+0.055
apped+0.055
box+0.053
ay+0.052
Neg logit contributions
eling-0.037
itive-0.035
 tale-0.032
Far-0.031
land-0.030
Tokenaroo
Token ID38049
Feature activation-3.43e-02

Pos logit contributions
amed+0.107
ay+0.106
enge+0.106
apped+0.105
ushes+0.097
Neg logit contributions
itive-0.124
eling-0.122
 why-0.114
land-0.112
 tale-0.108
Token would
Token ID561
Feature activation-2.00e-02

Pos logit contributions
apped+0.054
enge+0.052
ay+0.051
amed+0.051
umbled+0.050
Neg logit contributions
eling-0.050
land-0.048
 tale-0.048
itive-0.047
 why-0.046
Token be
Token ID307
Feature activation+6.61e-03

Pos logit contributions
apped+0.033
enge+0.031
ay+0.031
amed+0.031
umbled+0.030
Neg logit contributions
eling-0.027
itive-0.026
land-0.025
 tale-0.025
 lav-0.024
Token okay
Token ID8788
Feature activation+1.40e-02

Pos logit contributions
eling+0.008
land+0.007
 tale+0.007
itive+0.007
 why+0.007
Neg logit contributions
apped-0.012
enge-0.012
ay-0.012
umbled-0.012
amed-0.011
Token.
Token ID13
Feature activation-2.59e-03

Pos logit contributions
eling+0.015
 tale+0.013
land+0.013
 lav+0.013
itive+0.013
Neg logit contributions
apped-0.028
ay-0.026
umbled-0.026
enge-0.026
amed-0.026
Tokenang
Token ID648
Feature activation-6.00e-02

Pos logit contributions
amed+0.027
enge+0.026
apped+0.026
umbled+0.024
ied+0.024
Neg logit contributions
eling-0.020
 tale-0.020
land-0.019
Far-0.018
 tul-0.018
Tokenaroo
Token ID38049
Feature activation-6.78e-03

Pos logit contributions
amed+0.077
apped+0.075
enge+0.074
ay+0.073
umbled+0.070
Neg logit contributions
eling-0.099
itive-0.098
land-0.097
 why-0.096
 tale-0.094
Token.
Token ID13
Feature activation-2.84e-03

Pos logit contributions
apped+0.013
enge+0.012
ay+0.012
umbled+0.012
amed+0.011
Neg logit contributions
eling-0.008
itive-0.008
land-0.007
 tale-0.007
 lav-0.007
Token The
Token ID383
Feature activation+7.92e-03

Pos logit contributions
apped+0.004
enge+0.004
amed+0.004
umbled+0.004
ay+0.004
Neg logit contributions
land-0.004
eling-0.004
 tale-0.004
 lav-0.004
 why-0.004
Token k
Token ID479
Feature activation-2.82e-02

Pos logit contributions
eling+0.010
land+0.010
 tale+0.010
itive+0.010
 lav+0.009
Neg logit contributions
apped-0.014
enge-0.013
amed-0.013
ay-0.013
umbled-0.013
Tokenang
Token ID648
Feature activation-7.75e-02

Pos logit contributions
amed+0.051
apped+0.050
enge+0.050
box+0.048
ied+0.048
Neg logit contributions
eling-0.029
 tale-0.028
 tul-0.027
land-0.025
Far-0.025
Tokenaroo
Token ID38049
Feature activation-1.82e-02

Pos logit contributions
amed+0.108
apped+0.104
enge+0.103
ay+0.101
umbled+0.097
Neg logit contributions
eling-0.122
itive-0.116
 why-0.116
land-0.113
 tale-0.110
Token looks
Token ID3073
Feature activation+3.64e-02

Pos logit contributions
apped+0.029
enge+0.028
ay+0.027
amed+0.027
umbled+0.026
Neg logit contributions
itive-0.025
eling-0.024
land-0.024
 tale-0.023
 hesitation-0.023
Token at
Token ID379
Feature activation-1.64e-02

Pos logit contributions
eling+0.061
land+0.060
itive+0.059
 tale+0.059
 lav+0.057
Neg logit contributions
apped-0.049
enge-0.046
amed-0.045
ay-0.044
umbled-0.044
Token them
Token ID606
Feature activation+1.51e-02

Pos logit contributions
apped+0.028
enge+0.027
amed+0.027
umbled+0.026
ay+0.026
Neg logit contributions
eling-0.022
land-0.021
itive-0.020
 tale-0.020
 why-0.020
Token.
Token ID13
Feature activation-1.88e-03

Pos logit contributions
eling+0.020
itive+0.019
 tale+0.019
 lav+0.019
land+0.018
Neg logit contributions
apped-0.026
umbled-0.024
ay-0.024
amed-0.023
enge-0.022
Token y
Token ID331
Feature activation-2.57e-02

Pos logit contributions
eling+0.004
itive+0.004
land+0.004
 tale+0.003
 lookout+0.003
Neg logit contributions
apped-0.004
ay-0.004
 losers-0.004
enge-0.004
ite-0.003
Tokenum
Token ID388
Feature activation-1.97e-02

Pos logit contributions
ay+0.026
apped+0.024
umbled+0.023
enge+0.021
amed+0.019
Neg logit contributions
land-0.054
eling-0.053
itive-0.053
 tale-0.052
 gn-0.052
Token,"
Token ID553
Feature activation+7.93e-03

Pos logit contributions
ay+0.030
apped+0.029
ite+0.027
enge+0.027
ane+0.026
Neg logit contributions
eling-0.030
 lookout-0.029
itionally-0.028
 lav-0.028
 symp-0.027
Token the
Token ID262
Feature activation-2.78e-03

Pos logit contributions
eling+0.013
land+0.012
itive+0.012
 tale+0.012
 lav+0.012
Neg logit contributions
apped-0.011
enge-0.011
ay-0.010
amed-0.010
umbled-0.010
Token k
Token ID479
Feature activation-3.31e-02

Pos logit contributions
apped+0.005
enge+0.004
ay+0.004
amed+0.004
umbled+0.004
Neg logit contributions
eling-0.004
land-0.004
itive-0.004
 tale-0.004
 lav-0.004
Tokenang
Token ID648
Feature activation-7.73e-02

Pos logit contributions
amed+0.052
apped+0.052
enge+0.051
umbled+0.048
ay+0.048
Neg logit contributions
eling-0.045
land-0.042
 tul-0.042
 tale-0.042
itive-0.041
Tokenaroo
Token ID38049
Feature activation-3.09e-02

Pos logit contributions
apped+0.090
enge+0.087
amed+0.085
ay+0.084
umbled+0.081
Neg logit contributions
eling-0.140
itive-0.136
land-0.136
 tale-0.133
 why-0.131
Token says
Token ID1139
Feature activation-2.92e-02

Pos logit contributions
apped+0.038
enge+0.035
ay+0.033
amed+0.033
umbled+0.032
Neg logit contributions
itive-0.055
eling-0.055
land-0.053
 tale-0.052
 hesitation-0.051
Token.
Token ID13
Feature activation-1.52e-02

Pos logit contributions
apped+0.063
ay+0.057
enge+0.057
amed+0.057
umbled+0.057
Neg logit contributions
eling-0.030
itive-0.026
land-0.026
 tale-0.026
 lav-0.024
Token\n
Token ID198
Feature activation-5.97e-03

Pos logit contributions
apped+0.027
enge+0.025
ay+0.025
amed+0.025
umbled+0.024
Neg logit contributions
eling-0.020
itive-0.019
land-0.019
 tale-0.019
 hesitation-0.017
Token\n
Token ID198
Feature activation-1.86e-03

Pos logit contributions
apped+0.012
enge+0.011
amed+0.011
umbled+0.011
ay+0.011
Neg logit contributions
eling-0.006
land-0.006
 tale-0.006
itive-0.006
 lav-0.006
Token It
Token ID632
Feature activation+3.40e-03

Pos logit contributions
apped+0.008
enge+0.008
amed+0.008
ay+0.008
umbled+0.008
Neg logit contributions
eling-0.006
land-0.006
itive-0.006
 tale-0.006
 lav-0.005
Token became
Token ID2627
Feature activation+2.57e-02

Pos logit contributions
eling+0.005
land+0.005
 tale+0.005
itive+0.005
 why+0.005
Neg logit contributions
apped-0.005
enge-0.005
ay-0.005
umbled-0.005
amed-0.005
Token the
Token ID262
Feature activation+1.74e-04

Pos logit contributions
eling+0.033
land+0.032
itive+0.031
 tale+0.031
 lav+0.029
Neg logit contributions
apped-0.045
enge-0.043
amed-0.043
ay-0.042
umbled-0.042
Token happiest
Token ID49414
Feature activation+2.54e-03

Pos logit contributions
 tale+0.000
land+0.000
eling+0.000
itive+0.000
 tul+0.000
Neg logit contributions
apped-0.000
umbled-0.000
enge-0.000
amed-0.000
ay-0.000
Token books
Token ID3835
Feature activation-2.71e-02

Pos logit contributions
 tale+0.003
land+0.003
itive+0.003
 why+0.003
 gn+0.003
Neg logit contributions
apped-0.004
amed-0.004
enge-0.004
umbled-0.004
ay-0.004
Tokenhe
Token ID258
Feature activation-7.73e-02

Pos logit contributions
enge+0.041
apped+0.039
amed+0.038
umbled+0.037
ay+0.035
Neg logit contributions
land-0.041
 tale-0.039
eling-0.039
itive-0.038
 seem-0.038
Tokenlf
Token ID1652
Feature activation-1.46e-02

Pos logit contributions
apped+0.152
enge+0.152
amed+0.148
ay+0.144
umbled+0.142
Neg logit contributions
itive-0.085
eling-0.079
land-0.075
 tale-0.073
 why-0.073
Token in
Token ID287
Feature activation+2.63e-03

Pos logit contributions
enge+0.022
apped+0.021
ay+0.020
amed+0.020
umbled+0.020
Neg logit contributions
land-0.022
itive-0.021
 tale-0.021
 why-0.020
 which-0.020
Token the
Token ID262
Feature activation-1.07e-03

Pos logit contributions
eling+0.003
land+0.003
 tale+0.003
itive+0.003
 why+0.003
Neg logit contributions
apped-0.005
enge-0.005
amed-0.005
ay-0.004
umbled-0.004
Token world
Token ID995
Feature activation+1.23e-02

Pos logit contributions
apped+0.002
enge+0.002
umbled+0.002
amed+0.002
ay+0.001
Neg logit contributions
 tale-0.002
land-0.002
eling-0.002
 gn-0.001
itive-0.001
Token!
Token ID0
Feature activation-9.84e-03

Pos logit contributions
land+0.012
eling+0.011
 tale+0.011
itive+0.011
 why+0.010
Neg logit contributions
apped-0.026
enge-0.025
amed-0.024
umbled-0.024
ay-0.024
Tokenang
Token ID648
Feature activation-6.20e-02

Pos logit contributions
amed+0.039
enge+0.038
apped+0.038
box+0.037
ied+0.035
Neg logit contributions
eling-0.030
 tale-0.028
land-0.027
itive-0.027
 tul-0.027
Tokenaroo
Token ID38049
Feature activation-7.02e-03

Pos logit contributions
apped+0.074
enge+0.072
amed+0.072
ay+0.069
umbled+0.066
Neg logit contributions
eling-0.109
itive-0.109
land-0.106
 tale-0.104
 why-0.103
Token.
Token ID13
Feature activation+1.60e-03

Pos logit contributions
apped+0.014
enge+0.013
ay+0.013
umbled+0.013
amed+0.013
Neg logit contributions
eling-0.007
itive-0.006
land-0.006
 lav-0.006
 tale-0.006
Token The
Token ID383
Feature activation+4.36e-03

Pos logit contributions
eling+0.003
land+0.002
itive+0.002
 tale+0.002
 lav+0.002
Neg logit contributions
apped-0.002
enge-0.002
ay-0.002
amed-0.002
umbled-0.002
Token k
Token ID479
Feature activation-3.02e-02

Pos logit contributions
eling+0.005
land+0.005
 tale+0.005
itive+0.005
 lav+0.005
Neg logit contributions
apped-0.008
enge-0.007
amed-0.007
umbled-0.007
ay-0.007
Tokenang
Token ID648
Feature activation-7.71e-02

Pos logit contributions
amed+0.053
enge+0.052
apped+0.052
box+0.051
ied+0.050
Neg logit contributions
eling-0.034
 tale-0.032
 tul-0.032
land-0.030
itive-0.029
Tokenaroo
Token ID38049
Feature activation-1.74e-02

Pos logit contributions
apped+0.097
amed+0.097
enge+0.096
ay+0.092
umbled+0.089
Neg logit contributions
eling-0.130
itive-0.126
land-0.124
 why-0.122
 tale-0.122
Token watches
Token ID16860
Feature activation-4.01e-03

Pos logit contributions
apped+0.029
enge+0.027
ay+0.026
amed+0.026
umbled+0.026
Neg logit contributions
eling-0.024
itive-0.023
land-0.023
 tale-0.022
 lav-0.021
Token them
Token ID606
Feature activation-7.36e-04

Pos logit contributions
apped+0.006
enge+0.006
ay+0.006
amed+0.006
umbled+0.006
Neg logit contributions
eling-0.006
land-0.005
itive-0.005
 tale-0.005
 lav-0.005
Token.
Token ID13
Feature activation-2.57e-03

Pos logit contributions
apped+0.001
ay+0.001
umbled+0.001
amed+0.001
enge+0.001
Neg logit contributions
eling-0.001
itive-0.001
 lav-0.001
 tale-0.001
uf-0.001
Token He
Token ID679
Feature activation+5.35e-03

Pos logit contributions
apped+0.004
enge+0.003
ay+0.003
amed+0.003
umbled+0.003
Neg logit contributions
eling-0.004
itive-0.004
land-0.004
 tale-0.004
 hesitation-0.004
Token me
Token ID502
Feature activation-1.61e-03

Pos logit contributions
apped+0.031
enge+0.030
ay+0.029
amed+0.029
umbled+0.028
Neg logit contributions
eling-0.033
land-0.033
itive-0.032
 tale-0.032
 lav-0.031
Token first
Token ID717
Feature activation-6.42e-03

Pos logit contributions
apped+0.003
enge+0.003
amed+0.003
ay+0.003
umbled+0.003
Neg logit contributions
eling-0.002
land-0.002
itive-0.002
 tale-0.002
 lav-0.002
Token".
Token ID1911
Feature activation-1.69e-02

Pos logit contributions
apped+0.011
enge+0.011
ay+0.011
amed+0.011
umbled+0.010
Neg logit contributions
eling-0.008
land-0.008
 tale-0.007
itive-0.007
 lav-0.007
Token The
Token ID383
Feature activation+1.12e-02

Pos logit contributions
apped+0.031
enge+0.029
ay+0.029
amed+0.028
umbled+0.028
Neg logit contributions
eling-0.021
land-0.020
itive-0.020
 tale-0.019
 lav-0.019
Token k
Token ID479
Feature activation-2.70e-02

Pos logit contributions
land+0.018
eling+0.018
itive+0.018
 tale+0.017
 lav+0.017
Neg logit contributions
apped-0.015
ay-0.014
amed-0.014
enge-0.014
umbled-0.013
Tokenang
Token ID648
Feature activation-7.68e-02

Pos logit contributions
apped+0.045
amed+0.044
enge+0.044
umbled+0.042
box+0.042
Neg logit contributions
eling-0.033
 tale-0.031
 tul-0.030
itive-0.029
land-0.029
Tokenaroo
Token ID38049
Feature activation-2.42e-02

Pos logit contributions
apped+0.101
enge+0.100
amed+0.099
ay+0.099
umbled+0.091
Neg logit contributions
eling-0.125
itive-0.122
 why-0.118
land-0.116
 gn-0.114
Token promised
Token ID8072
Feature activation-3.84e-02

Pos logit contributions
apped+0.040
enge+0.039
ay+0.039
amed+0.038
umbled+0.037
Neg logit contributions
eling-0.032
land-0.031
itive-0.030
 tale-0.030
 scr-0.030
Token he
Token ID339
Feature activation-9.68e-03

Pos logit contributions
apped+0.051
enge+0.049
ay+0.048
umbled+0.048
amed+0.047
Neg logit contributions
eling-0.063
itive-0.060
 scr-0.059
 tul-0.059
 tale-0.058
Token would
Token ID561
Feature activation-9.89e-03

Pos logit contributions
apped+0.018
ay+0.018
enge+0.017
amed+0.017
umbled+0.017
Neg logit contributions
 why-0.011
 seem-0.011
land-0.011
eling-0.011
itive-0.010
Token,
Token ID11
Feature activation-9.64e-03

Pos logit contributions
apped+0.016
ay+0.015
enge+0.015
amed+0.015
umbled+0.015
Neg logit contributions
eling-0.014
itive-0.013
land-0.013
 tale-0.013
 lav-0.012
Token?"
Token ID1701
Feature activation-1.90e-03

Pos logit contributions
apped+0.003
umbled+0.003
enge+0.003
amed+0.003
ay+0.003
Neg logit contributions
 why-0.001
land-0.001
 tale-0.001
eling-0.001
itive-0.001
Token\n
Token ID198
Feature activation-1.23e-02

Pos logit contributions
apped+0.003
enge+0.003
ay+0.003
umbled+0.003
amed+0.003
Neg logit contributions
eling-0.003
itive-0.003
land-0.003
 tale-0.003
 scr-0.003
Token\n
Token ID198
Feature activation-4.32e-03

Pos logit contributions
apped+0.024
enge+0.023
amed+0.023
umbled+0.023
ay+0.022
Neg logit contributions
eling-0.013
land-0.013
 tale-0.012
itive-0.012
 lav-0.012
TokenThe
Token ID464
Feature activation+6.31e-03

Pos logit contributions
apped+0.008
enge+0.007
amed+0.007
ay+0.007
umbled+0.007
Neg logit contributions
eling-0.005
land-0.005
itive-0.005
 tale-0.005
 lav-0.005
Token k
Token ID479
Feature activation-2.72e-02

Pos logit contributions
eling+0.009
land+0.008
itive+0.008
 tale+0.008
 lav+0.008
Neg logit contributions
apped-0.010
enge-0.010
ay-0.010
amed-0.010
umbled-0.010
Tokenang
Token ID648
Feature activation-7.66e-02

Pos logit contributions
amed+0.047
apped+0.046
enge+0.046
box+0.044
umbled+0.044
Neg logit contributions
eling-0.031
 tale-0.030
 tul-0.029
land-0.028
itive-0.028
Tokenaroo
Token ID38049
Feature activation-2.05e-02

Pos logit contributions
apped+0.098
amed+0.097
enge+0.097
ay+0.093
umbled+0.090
Neg logit contributions
eling-0.127
itive-0.124
 why-0.120
land-0.120
 tale-0.119
Token nods
Token ID34833
Feature activation-4.30e-03

Pos logit contributions
apped+0.033
enge+0.032
ay+0.031
amed+0.031
umbled+0.030
Neg logit contributions
eling-0.028
itive-0.028
land-0.027
 tale-0.026
 hesitation-0.026
Token.
Token ID13
Feature activation-6.10e-03

Pos logit contributions
apped+0.008
amed+0.007
umbled+0.007
ay+0.007
enge+0.007
Neg logit contributions
eling-0.006
 tale-0.005
 tul-0.005
uf-0.005
land-0.005
Token It
Token ID632
Feature activation+1.49e-04

Pos logit contributions
apped+0.008
enge+0.007
amed+0.007
ay+0.007
umbled+0.007
Neg logit contributions
eling-0.011
land-0.010
itive-0.010
 tale-0.010
 lav-0.010
Token hops
Token ID29438
Feature activation+8.26e-03

Pos logit contributions
eling+0.000
land+0.000
itive+0.000
 tale+0.000
 lav+0.000
Neg logit contributions
apped-0.000
enge-0.000
amed-0.000
ay-0.000
umbled-0.000
Token She
Token ID1375
Feature activation-1.16e-03

Pos logit contributions
apped+0.014
amed+0.013
enge+0.013
ay+0.013
umbled+0.013
Neg logit contributions
land-0.010
 gn-0.010
 why-0.009
 lav-0.009
 tale-0.009
Token said
Token ID531
Feature activation-1.69e-02

Pos logit contributions
ay+0.002
umbled+0.002
apped+0.002
enge+0.002
amed+0.002
Neg logit contributions
 why-0.001
 scr-0.001
eling-0.001
 gn-0.001
 tale-0.001
Token,
Token ID11
Feature activation-4.87e-03

Pos logit contributions
apped+0.035
enge+0.033
ay+0.032
amed+0.032
umbled+0.032
Neg logit contributions
eling-0.017
land-0.016
itive-0.015
 tale-0.015
 lav-0.014
Token "
Token ID366
Feature activation+1.06e-02

Pos logit contributions
apped+0.007
amed+0.006
ay+0.006
enge+0.006
umbled+0.005
Neg logit contributions
eling-0.009
 tale-0.008
itive-0.008
land-0.008
 hesitation-0.007
TokenH
Token ID39
Feature activation-2.21e-02

Pos logit contributions
eling+0.014
land+0.014
 tale+0.013
itive+0.013
 lav+0.013
Neg logit contributions
apped-0.018
enge-0.017
amed-0.017
ay-0.017
umbled-0.017
Tokenang
Token ID648
Feature activation-7.65e-02

Pos logit contributions
apped+0.042
ay+0.040
enge+0.040
umbled+0.040
amed+0.039
Neg logit contributions
eling-0.025
land-0.024
 tale-0.024
itive-0.024
 lav-0.023
Token up
Token ID510
Feature activation-1.83e-02

Pos logit contributions
apped+0.112
enge+0.108
amed+0.107
umbled+0.106
ay+0.105
Neg logit contributions
eling-0.117
itive-0.112
land-0.109
 why-0.109
 tale-0.109
Token please
Token ID3387
Feature activation-2.13e-02

Pos logit contributions
apped+0.035
enge+0.033
ay+0.033
amed+0.032
umbled+0.032
Neg logit contributions
eling-0.021
land-0.020
itive-0.019
 tale-0.019
 lav-0.018
Token!"
Token ID2474
Feature activation-3.34e-03

Pos logit contributions
apped+0.038
ay+0.036
enge+0.036
amed+0.036
umbled+0.035
Neg logit contributions
eling-0.027
itive-0.025
land-0.025
 tale-0.024
 lav-0.024
Token Her
Token ID2332
Feature activation+1.72e-02

Pos logit contributions
apped+0.005
enge+0.005
umbled+0.005
ay+0.005
amed+0.005
Neg logit contributions
eling-0.005
itive-0.005
 tale-0.005
land-0.005
 hesitation-0.005
Token mother
Token ID2802
Feature activation-1.61e-02

Pos logit contributions
eling+0.031
land+0.030
itive+0.029
 tale+0.029
 lav+0.028
Neg logit contributions
apped-0.022
enge-0.020
ay-0.020
amed-0.020
umbled-0.019
Token found
Token ID1043
Feature activation-1.46e-02

Pos logit contributions
ay+0.019
apped+0.019
enge+0.018
umbled+0.018
 losers+0.017
Neg logit contributions
 gn-0.014
eling-0.014
land-0.014
 tale-0.013
 why-0.013
Token a
Token ID257
Feature activation-2.13e-02

Pos logit contributions
apped+0.031
ay+0.031
enge+0.031
umbled+0.030
ane+0.029
Neg logit contributions
 lav-0.012
 why-0.011
 tul-0.011
 gn-0.011
eling-0.011
Token teaspoon
Token ID22326
Feature activation-1.62e-02

Pos logit contributions
apped+0.035
enge+0.034
ay+0.033
amed+0.033
umbled+0.033
Neg logit contributions
eling-0.029
land-0.028
itive-0.027
 tale-0.027
 lav-0.026
Token in
Token ID287
Feature activation+1.45e-03

Pos logit contributions
apped+0.025
enge+0.024
ay+0.023
amed+0.023
umbled+0.023
Neg logit contributions
eling-0.024
land-0.023
itive-0.023
 tale-0.023
 lav-0.022
Token her
Token ID607
Feature activation+1.42e-02

Pos logit contributions
itive+0.002
land+0.002
 tale+0.002
eling+0.002
itionally+0.002
Neg logit contributions
ay-0.002
apped-0.002
amed-0.002
enge-0.002
umbled-0.002
Token kitchen
Token ID9592
Feature activation+4.87e-02

Pos logit contributions
 tale+0.020
 lav+0.020
land+0.020
eling+0.020
 tul+0.020
Neg logit contributions
apped-0.022
enge-0.021
umbled-0.021
amed-0.020
ay-0.020
Token drawer
Token ID33451
Feature activation+1.05e-03

Pos logit contributions
eling+0.053
 tale+0.047
itive+0.047
land+0.046
 hesitation+0.045
Neg logit contributions
apped-0.096
enge-0.090
ay-0.089
umbled-0.089
box-0.088
Token and
Token ID290
Feature activation-6.16e-03

Pos logit contributions
eling+0.001
land+0.001
itive+0.001
 tale+0.001
 gn+0.001
Neg logit contributions
apped-0.002
enge-0.002
ay-0.002
amed-0.002
umbled-0.002
Token she
Token ID673
Feature activation-7.15e-03

Pos logit contributions
apped+0.011
ay+0.011
enge+0.011
umbled+0.011
amed+0.011
Neg logit contributions
eling-0.007
 why-0.007
 gn-0.007
land-0.007
 lav-0.007
Token was
Token ID373
Feature activation-2.89e-03

Pos logit contributions
ay+0.014
enge+0.014
apped+0.013
ane+0.013
umbled+0.013
Neg logit contributions
 gn-0.008
eling-0.008
 scr-0.008
 why-0.008
 tul-0.007
Token so
Token ID523
Feature activation-6.12e-03

Pos logit contributions
apped+0.005
enge+0.005
ay+0.005
umbled+0.005
amed+0.005
Neg logit contributions
eling-0.004
land-0.003
 tale-0.003
itive-0.003
 why-0.003
Token's
Token ID338
Feature activation+3.84e-03

Pos logit contributions
apped+0.001
enge+0.001
amed+0.001
ay+0.001
umbled+0.001
Neg logit contributions
eling-0.002
land-0.002
itive-0.002
 tale-0.002
 hesitation-0.001
Token okay
Token ID8788
Feature activation+3.06e-02

Pos logit contributions
land+0.005
itive+0.005
eling+0.005
 tale+0.005
 gn+0.004
Neg logit contributions
amed-0.007
apped-0.007
ay-0.006
enge-0.006
umbled-0.006
Token,
Token ID11
Feature activation-8.38e-03

Pos logit contributions
eling+0.039
land+0.036
 tale+0.034
 lav+0.033
itive+0.033
Neg logit contributions
apped-0.054
ay-0.052
enge-0.052
umbled-0.049
amed-0.048
Token I
Token ID314
Feature activation+1.47e-02

Pos logit contributions
apped+0.014
enge+0.013
ay+0.013
amed+0.013
umbled+0.012
Neg logit contributions
eling-0.011
land-0.011
 tale-0.011
itive-0.011
 lav-0.010
Token understand
Token ID1833
Feature activation-3.50e-03

Pos logit contributions
 tale+0.019
 why+0.019
land+0.019
eling+0.019
 seem+0.018
Neg logit contributions
apped-0.024
enge-0.024
amed-0.024
umbled-0.023
ay-0.023
Token why
Token ID1521
Feature activation+4.64e-02

Pos logit contributions
apped+0.006
enge+0.006
ay+0.006
umbled+0.005
amed+0.005
Neg logit contributions
eling-0.004
land-0.004
itive-0.004
 gn-0.004
 tale-0.004
Token you
Token ID345
Feature activation-4.30e-03

Pos logit contributions
land+0.073
eling+0.073
 tale+0.071
itive+0.070
 lav+0.069
Neg logit contributions
apped-0.065
enge-0.060
ay-0.060
umbled-0.059
amed-0.057
Token did
Token ID750
Feature activation+5.92e-03

Pos logit contributions
apped+0.008
enge+0.008
amed+0.008
ay+0.008
umbled+0.008
Neg logit contributions
eling-0.005
 seem-0.005
 why-0.005
 tale-0.005
itive-0.005
Token it
Token ID340
Feature activation-7.14e-03

Pos logit contributions
land+0.009
itive+0.009
eling+0.009
 tale+0.009
 lav+0.009
Neg logit contributions
apped-0.008
ay-0.007
amed-0.007
enge-0.007
 losers-0.007
Token.
Token ID13
Feature activation-6.09e-03

Pos logit contributions
apped+0.013
enge+0.012
ay+0.012
amed+0.012
umbled+0.012
Neg logit contributions
eling-0.009
land-0.008
 tale-0.008
itive-0.008
 lav-0.008
Token But
Token ID887
Feature activation+3.71e-03

Pos logit contributions
apped+0.009
amed+0.009
enge+0.009
umbled+0.008
ay+0.008
Neg logit contributions
eling-0.009
 why-0.009
 tale-0.009
land-0.009
 lav-0.009
Token was
Token ID373
Feature activation+4.14e-03

Pos logit contributions
enge+0.036
ay+0.035
apped+0.035
umbled+0.033
amed+0.033
Neg logit contributions
eling-0.025
 scr-0.024
 why-0.024
land-0.023
itive-0.023
Token so
Token ID523
Feature activation-4.42e-03

Pos logit contributions
eling+0.005
 why+0.004
land+0.004
 scr+0.004
 tale+0.004
Neg logit contributions
enge-0.008
apped-0.008
ay-0.008
umbled-0.008
ane-0.007
Token glad
Token ID9675
Feature activation+1.39e-02

Pos logit contributions
enge+0.008
apped+0.008
ay+0.008
umbled+0.008
amed+0.008
Neg logit contributions
eling-0.005
itive-0.005
land-0.005
 why-0.005
 tale-0.005
Token to
Token ID284
Feature activation-4.55e-03

Pos logit contributions
eling+0.024
itive+0.022
 lav+0.022
 tale+0.022
 tul+0.021
Neg logit contributions
apped-0.019
umbled-0.017
ay-0.016
enge-0.016
 of-0.015
Token understand
Token ID1833
Feature activation+7.24e-03

Pos logit contributions
apped+0.006
ay+0.006
amed+0.006
umbled+0.006
enge+0.006
Neg logit contributions
eling-0.007
land-0.007
itive-0.007
 tale-0.007
 lav-0.007
Token why
Token ID1521
Feature activation+4.66e-02

Pos logit contributions
eling+0.009
itive+0.009
 tale+0.008
 tul+0.008
land+0.008
Neg logit contributions
apped-0.012
ay-0.011
box-0.011
enge-0.011
 of-0.010
Token everything
Token ID2279
Feature activation+2.86e-03

Pos logit contributions
eling+0.063
land+0.063
 tale+0.061
 tul+0.058
 lav+0.058
Neg logit contributions
apped-0.076
ay-0.069
umbled-0.069
enge-0.068
amed-0.066
Token looked
Token ID3114
Feature activation+2.42e-02

Pos logit contributions
eling+0.004
land+0.004
 tale+0.004
itive+0.004
 why+0.004
Neg logit contributions
apped-0.004
enge-0.004
amed-0.004
ay-0.004
umbled-0.004
Token like
Token ID588
Feature activation+1.77e-02

Pos logit contributions
eling+0.032
 tale+0.030
itive+0.030
land+0.030
 tul+0.029
Neg logit contributions
apped-0.040
ay-0.038
amed-0.038
enge-0.037
umbled-0.037
Token such
Token ID884
Feature activation-2.33e-03

Pos logit contributions
eling+0.021
land+0.020
itive+0.020
 tale+0.020
 lav+0.019
Neg logit contributions
apped-0.032
enge-0.031
amed-0.030
ay-0.030
umbled-0.030
Token a
Token ID257
Feature activation-2.65e-03

Pos logit contributions
apped+0.003
amed+0.003
enge+0.003
ay+0.003
umbled+0.003
Neg logit contributions
eling-0.004
land-0.004
 tale-0.003
 tul-0.003
cel-0.003
Token the
Token ID262
Feature activation-1.33e-02

Pos logit contributions
enge+0.013
apped+0.013
umbled+0.013
ane+0.012
ay+0.012
Neg logit contributions
land-0.006
 lav-0.006
 why-0.006
 tul-0.006
 scr-0.006
Token grapes
Token ID36656
Feature activation+2.07e-02

Pos logit contributions
apped+0.025
enge+0.024
umbled+0.023
amed+0.022
ay+0.022
Neg logit contributions
land-0.016
eling-0.016
 tale-0.016
 tul-0.015
 lav-0.015
Token out
Token ID503
Feature activation-1.45e-02

Pos logit contributions
land+0.032
eling+0.031
 gn+0.030
 scr+0.030
 tale+0.030
Neg logit contributions
apped-0.029
enge-0.029
umbled-0.027
ay-0.027
amed-0.026
Token on
Token ID319
Feature activation+1.05e-02

Pos logit contributions
apped+0.014
enge+0.014
amed+0.013
umbled+0.013
ay+0.012
Neg logit contributions
eling-0.029
land-0.029
 tale-0.028
 why-0.028
itive-0.028
Token her
Token ID607
Feature activation+2.30e-02

Pos logit contributions
eling+0.014
land+0.013
itive+0.013
 tale+0.013
 lav+0.012
Neg logit contributions
apped-0.018
enge-0.017
ay-0.017
amed-0.017
umbled-0.016
Token kitchen
Token ID9592
Feature activation+5.10e-02

Pos logit contributions
land+0.033
eling+0.032
 tale+0.032
 scr+0.031
itive+0.031
Neg logit contributions
apped-0.037
enge-0.035
umbled-0.034
amed-0.034
ay-0.034
Token table
Token ID3084
Feature activation+1.55e-02

Pos logit contributions
eling+0.069
itive+0.065
 tale+0.064
land+0.064
 lav+0.062
Neg logit contributions
apped-0.085
enge-0.081
ay-0.080
amed-0.079
umbled-0.078
Token,
Token ID11
Feature activation-2.65e-03

Pos logit contributions
land+0.012
eling+0.012
itive+0.011
 tale+0.011
 gn+0.011
Neg logit contributions
apped-0.034
enge-0.033
amed-0.032
umbled-0.032
ay-0.032
Token so
Token ID523
Feature activation-1.07e-02

Pos logit contributions
apped+0.005
enge+0.005
amed+0.005
ay+0.005
umbled+0.005
Neg logit contributions
eling-0.003
land-0.003
itive-0.003
 tale-0.003
 lav-0.003
Token she
Token ID673
Feature activation-2.32e-02

Pos logit contributions
apped+0.024
enge+0.024
amed+0.024
ay+0.023
umbled+0.023
Neg logit contributions
eling-0.008
land-0.007
itive-0.007
 why-0.007
 tale-0.007
Token could
Token ID714
Feature activation-1.20e-02

Pos logit contributions
ay+0.044
enge+0.043
apped+0.042
umbled+0.042
amed+0.040
Neg logit contributions
eling-0.027
 scr-0.027
 lav-0.025
 why-0.025
 gn-0.025
Token her
Token ID607
Feature activation-6.11e-03

Pos logit contributions
eling+0.008
land+0.007
itive+0.007
 tale+0.007
 lav+0.007
Neg logit contributions
apped-0.012
enge-0.011
amed-0.011
ay-0.011
umbled-0.011
Token mom
Token ID1995
Feature activation-1.92e-02

Pos logit contributions
enge+0.011
umbled+0.011
apped+0.010
amed+0.010
ase+0.010
Neg logit contributions
 scr-0.007
 why-0.007
itive-0.007
 gn-0.007
 sweet-0.007
Token said
Token ID531
Feature activation-8.77e-03

Pos logit contributions
apped+0.033
enge+0.032
ay+0.031
amed+0.031
umbled+0.031
Neg logit contributions
eling-0.025
land-0.024
itive-0.024
 tale-0.023
 lav-0.023
Token.
Token ID13
Feature activation-1.26e-03

Pos logit contributions
apped+0.017
amed+0.014
ay+0.014
umbled+0.014
enge+0.014
Neg logit contributions
eling-0.012
 tale-0.011
land-0.011
 tul-0.010
 lav-0.010
Token "
Token ID366
Feature activation+1.61e-02

Pos logit contributions
apped+0.002
ay+0.002
enge+0.002
amed+0.002
umbled+0.002
Neg logit contributions
eling-0.002
itive-0.002
 tale-0.002
land-0.002
 hesitation-0.002
TokenWhat
Token ID2061
Feature activation+4.99e-02

Pos logit contributions
eling+0.020
land+0.020
itive+0.019
 tale+0.019
 lav+0.018
Neg logit contributions
apped-0.028
enge-0.027
amed-0.027
ay-0.027
umbled-0.026
Token kind
Token ID1611
Feature activation-1.32e-02

Pos logit contributions
land+0.055
 why+0.055
 tale+0.055
itive+0.054
eling+0.053
Neg logit contributions
apped-0.095
amed-0.094
umbled-0.092
enge-0.089
ay-0.087
Token of
Token ID286
Feature activation-1.38e-02

Pos logit contributions
amed+0.016
apped+0.015
umbled+0.015
ushes+0.013
enge+0.013
Neg logit contributions
 why-0.025
 tale-0.025
land-0.024
 lav-0.023
 while-0.023
Token story
Token ID1621
Feature activation+2.71e-02

Pos logit contributions
apped+0.024
enge+0.023
amed+0.023
ay+0.023
umbled+0.023
Neg logit contributions
eling-0.017
land-0.017
 tale-0.016
itive-0.016
 lav-0.016
Token do
Token ID466
Feature activation+4.45e-03

Pos logit contributions
eling+0.040
itive+0.039
land+0.038
 tul+0.037
 tale+0.037
Neg logit contributions
apped-0.042
ay-0.041
enge-0.040
amed-0.039
umbled-0.037
Token you
Token ID345
Feature activation+3.84e-03

Pos logit contributions
eling+0.006
land+0.006
 tale+0.006
itive+0.006
 why+0.006
Neg logit contributions
apped-0.007
amed-0.007
enge-0.007
umbled-0.007
ay-0.007
Token play
Token ID711
Feature activation-4.80e-03

Pos logit contributions
land+0.032
eling+0.031
itive+0.031
 lav+0.031
 tale+0.030
Neg logit contributions
apped-0.022
ay-0.021
enge-0.021
 losers-0.020
room-0.019
Token with
Token ID351
Feature activation-2.08e-04

Pos logit contributions
apped+0.008
enge+0.007
umbled+0.007
amed+0.007
ay+0.007
Neg logit contributions
eling-0.007
land-0.006
 tale-0.006
itive-0.006
 why-0.006
Token him
Token ID683
Feature activation+1.93e-02

Pos logit contributions
apped+0.000
enge+0.000
amed+0.000
ay+0.000
umbled+0.000
Neg logit contributions
eling-0.000
land-0.000
itive-0.000
 tale-0.000
 lav-0.000
Token.
Token ID13
Feature activation+7.20e-03

Pos logit contributions
eling+0.017
land+0.016
itive+0.016
 tale+0.016
 lav+0.015
Neg logit contributions
apped-0.041
enge-0.040
amed-0.039
ay-0.039
umbled-0.039
Token Max
Token ID5436
Feature activation+8.73e-03

Pos logit contributions
eling+0.011
land+0.010
itive+0.010
 tale+0.010
 hesitation+0.010
Neg logit contributions
apped-0.011
enge-0.011
ay-0.011
umbled-0.010
amed-0.010
Token was
Token ID373
Feature activation+7.96e-04

Pos logit contributions
eling+0.013
land+0.012
itive+0.012
 tale+0.012
 lav+0.012
Neg logit contributions
apped-0.014
enge-0.013
ay-0.013
amed-0.013
umbled-0.013
Token sad
Token ID6507
Feature activation+2.94e-02

Pos logit contributions
eling+0.001
 why+0.001
 tale+0.001
 scr+0.001
 gn+0.001
Neg logit contributions
apped-0.002
enge-0.001
ay-0.001
umbled-0.001
ane-0.001
Token,
Token ID11
Feature activation+1.03e-02

Pos logit contributions
eling+0.051
 tul+0.047
land+0.046
 lav+0.046
itive+0.046
Neg logit contributions
apped-0.042
ay-0.040
umbled-0.038
 of-0.036
enge-0.036
Token but
Token ID475
Feature activation+1.51e-02

Pos logit contributions
eling+0.008
land+0.008
 tale+0.007
itive+0.007
 lav+0.007
Neg logit contributions
apped-0.023
enge-0.022
amed-0.021
ay-0.021
umbled-0.021
Token he
Token ID339
Feature activation+7.89e-05

Pos logit contributions
eling+0.015
land+0.014
itive+0.014
 tale+0.014
 why+0.013
Neg logit contributions
apped-0.031
enge-0.030
ay-0.029
amed-0.029
umbled-0.029
Token had
Token ID550
Feature activation-1.04e-02

Pos logit contributions
 why+0.000
 scr+0.000
 gn+0.000
eling+0.000
 spit+0.000
Neg logit contributions
ay-0.000
apped-0.000
ane-0.000
umbled-0.000
anes-0.000
Token\n
Token ID198
Feature activation-3.10e-03

Pos logit contributions
apped+0.005
enge+0.005
ay+0.005
umbled+0.004
amed+0.004
Neg logit contributions
eling-0.005
land-0.005
itive-0.005
 tale-0.005
 hesitation-0.004
Token\n
Token ID198
Feature activation+1.79e-03

Pos logit contributions
apped+0.006
enge+0.006
amed+0.006
ay+0.006
umbled+0.006
Neg logit contributions
eling-0.003
land-0.003
itive-0.003
 tale-0.003
 why-0.003
TokenS
Token ID50
Feature activation+2.20e-02

Pos logit contributions
eling+0.002
land+0.002
itive+0.002
 tale+0.002
 lav+0.002
Neg logit contributions
apped-0.003
enge-0.003
ay-0.003
amed-0.003
umbled-0.003
Tokenara
Token ID3301
Feature activation-5.13e-03

Pos logit contributions
eling+0.029
land+0.028
itive+0.027
 tale+0.027
 lav+0.026
Neg logit contributions
apped-0.038
enge-0.036
ay-0.036
amed-0.035
umbled-0.035
Token was
Token ID373
Feature activation+2.52e-03

Pos logit contributions
enge+0.010
ay+0.010
apped+0.010
ane+0.009
amer+0.009
Neg logit contributions
eling-0.006
 scr-0.005
 gn-0.005
 why-0.005
itive-0.005
Token scared
Token ID12008
Feature activation+3.02e-03

Pos logit contributions
eling+0.003
 why+0.003
 scr+0.003
 gn+0.002
 wishing+0.002
Neg logit contributions
enge-0.005
ay-0.005
apped-0.005
umbled-0.005
ane-0.005
Token and
Token ID290
Feature activation+1.63e-02

Pos logit contributions
eling+0.005
 lav+0.005
 gn+0.005
 tul+0.005
 tale+0.005
Neg logit contributions
ay-0.004
apped-0.004
 of-0.004
umbled-0.004
amed-0.003
Token hurt
Token ID5938
Feature activation-7.91e-03

Pos logit contributions
eling+0.021
 why+0.019
 scr+0.019
 gn+0.018
itive+0.018
Neg logit contributions
ay-0.030
enge-0.029
apped-0.029
amed-0.028
ase-0.028
Token.
Token ID13
Feature activation+6.49e-03

Pos logit contributions
apped+0.015
umbled+0.015
amed+0.014
enge+0.014
ay+0.014
Neg logit contributions
 tale-0.010
eling-0.009
land-0.009
 lav-0.009
 tul-0.008
Token She
Token ID1375
Feature activation+1.46e-02

Pos logit contributions
eling+0.011
land+0.011
 tale+0.011
itive+0.011
 hesitation+0.010
Neg logit contributions
apped-0.009
enge-0.008
umbled-0.008
ay-0.008
amed-0.008
Token cried
Token ID16896
Feature activation-3.00e-02

Pos logit contributions
eling+0.019
 scr+0.018
 why+0.017
 gn+0.017
 lav+0.016
Neg logit contributions
apped-0.027
ay-0.027
enge-0.026
ane-0.025
amed-0.025
Token and
Token ID290
Feature activation-9.53e-03

Pos logit contributions
apped+0.013
enge+0.013
umbled+0.013
amed+0.013
ay+0.012
Neg logit contributions
 tale-0.008
land-0.008
 gn-0.007
itive-0.007
 why-0.007
Token put
Token ID1234
Feature activation+5.15e-03

Pos logit contributions
ay+0.016
apped+0.016
enge+0.016
umbled+0.016
amed+0.015
Neg logit contributions
eling-0.012
 why-0.011
 scr-0.011
itive-0.011
 gn-0.011
Token it
Token ID340
Feature activation-7.33e-03

Pos logit contributions
eling+0.005
 why+0.005
 what+0.005
 lav+0.005
 hesitation+0.005
Neg logit contributions
apped-0.010
umbled-0.010
enge-0.010
amed-0.009
ane-0.009
Token back
Token ID736
Feature activation-2.48e-02

Pos logit contributions
enge+0.011
apped+0.011
umbled+0.010
ay+0.010
amer+0.009
Neg logit contributions
eling-0.011
land-0.010
 why-0.010
 tale-0.010
itive-0.010
Token in
Token ID287
Feature activation+1.59e-02

Pos logit contributions
enge+0.036
apped+0.035
ay+0.035
umbled+0.033
amed+0.033
Neg logit contributions
land-0.037
eling-0.036
 why-0.035
itive-0.035
 tale-0.035
Token the
Token ID262
Feature activation+1.70e-02

Pos logit contributions
eling+0.017
 tale+0.016
land+0.016
 lav+0.016
itive+0.016
Neg logit contributions
apped-0.031
enge-0.030
umbled-0.029
amed-0.029
ay-0.028
Token drawer
Token ID33451
Feature activation-1.68e-03

Pos logit contributions
eling+0.029
land+0.028
 tale+0.028
 lav+0.027
itive+0.027
Neg logit contributions
apped-0.022
enge-0.021
umbled-0.020
amed-0.020
ay-0.020
Token.
Token ID13
Feature activation-5.82e-03

Pos logit contributions
apped+0.004
enge+0.004
ay+0.004
amed+0.004
umbled+0.004
Neg logit contributions
land-0.001
eling-0.001
itive-0.001
 tale-0.001
 gn-0.001
Token␣
Token ID220
Feature activation-9.68e-03

Pos logit contributions
enge+0.011
apped+0.011
amed+0.011
umbled+0.011
ay+0.010
Neg logit contributions
 why-0.006
land-0.006
eling-0.005
 tale-0.005
 seem-0.005
Token\n
Token ID198
Feature activation+9.73e-03

Pos logit contributions
apped+0.019
enge+0.018
amed+0.018
ay+0.018
umbled+0.018
Neg logit contributions
eling-0.010
land-0.010
 tale-0.009
itive-0.009
 why-0.009
Token\n
Token ID198
Feature activation+1.58e-02

Pos logit contributions
land+0.009
eling+0.009
 tale+0.008
 why+0.008
 gn+0.008
Neg logit contributions
apped-0.020
enge-0.019
umbled-0.019
amed-0.019
ay-0.018
Token home
Token ID1363
Feature activation-2.70e-02

Pos logit contributions
apped+0.031
enge+0.030
amed+0.029
ay+0.029
umbled+0.028
Neg logit contributions
eling-0.023
land-0.022
itive-0.021
 tale-0.021
 why-0.021
Token.
Token ID13
Feature activation-3.90e-03

Pos logit contributions
apped+0.051
amed+0.046
umbled+0.046
enge+0.046
ay+0.046
Neg logit contributions
eling-0.030
 lav-0.029
 tale-0.029
itive-0.028
 tul-0.027
Token And
Token ID843
Feature activation+5.31e-03

Pos logit contributions
apped+0.006
enge+0.006
amed+0.006
ay+0.006
umbled+0.006
Neg logit contributions
land-0.005
eling-0.005
 gn-0.005
 why-0.005
 tale-0.005
Token that
Token ID326
Feature activation-8.08e-03

Pos logit contributions
 why+0.005
eling+0.005
land+0.005
 tale+0.005
 gn+0.005
Neg logit contributions
apped-0.010
enge-0.010
ay-0.010
umbled-0.010
amed-0.010
Token's
Token ID338
Feature activation+3.47e-03

Pos logit contributions
apped+0.012
enge+0.012
ay+0.012
amed+0.012
umbled+0.011
Neg logit contributions
eling-0.012
land-0.012
itive-0.011
 tale-0.011
 lav-0.011
Token how
Token ID703
Feature activation+3.24e-02

Pos logit contributions
land+0.003
eling+0.003
itive+0.003
 tale+0.003
 seem+0.003
Neg logit contributions
apped-0.007
amed-0.007
enge-0.007
ay-0.007
umbled-0.006
Token Will
Token ID2561
Feature activation-1.54e-02

Pos logit contributions
eling+0.035
 why+0.033
itive+0.032
land+0.032
 gn+0.031
Neg logit contributions
apped-0.063
enge-0.061
amed-0.061
ay-0.060
umbled-0.059
Tokeny
Token ID88
Feature activation-2.32e-02

Pos logit contributions
apped+0.023
enge+0.023
amed+0.022
ay+0.022
umbled+0.021
Neg logit contributions
eling-0.024
land-0.023
itive-0.022
 tale-0.022
 why-0.022
Token found
Token ID1043
Feature activation-2.04e-02

Pos logit contributions
apped+0.040
enge+0.038
amed+0.038
ay+0.037
umbled+0.037
Neg logit contributions
eling-0.032
land-0.030
 tale-0.029
itive-0.029
 why-0.029
Token his
Token ID465
Feature activation-1.41e-02

Pos logit contributions
apped+0.041
enge+0.039
umbled+0.039
amed+0.038
ay+0.038
Neg logit contributions
eling-0.021
land-0.021
 why-0.020
 tale-0.020
 lav-0.019
Token ancient
Token ID6156
Feature activation-1.10e-02

Pos logit contributions
apped+0.024
enge+0.023
amed+0.022
ay+0.022
umbled+0.022
Neg logit contributions
eling-0.019
land-0.019
 tale-0.018
itive-0.018
 lav-0.018
Token playful
Token ID34264
Feature activation-1.42e-02

Pos logit contributions
apped+0.008
enge+0.007
amed+0.007
umbled+0.007
ay+0.007
Neg logit contributions
eling-0.006
land-0.006
 tale-0.006
itive-0.006
 gn-0.006
Token pup
Token ID15552
Feature activation-1.30e-02

Pos logit contributions
apped+0.018
amed+0.017
enge+0.017
umbled+0.017
ay+0.016
Neg logit contributions
land-0.024
 tale-0.024
itive-0.024
eling-0.024
 gn-0.023
Token.
Token ID13
Feature activation-1.57e-03

Pos logit contributions
apped+0.030
amed+0.029
enge+0.029
umbled+0.029
ay+0.029
Neg logit contributions
 tale-0.009
land-0.009
eling-0.009
itive-0.008
 why-0.008
Token He
Token ID679
Feature activation+5.62e-03

Pos logit contributions
apped+0.002
enge+0.002
amed+0.002
umbled+0.002
ay+0.002
Neg logit contributions
eling-0.002
land-0.002
itive-0.002
 tale-0.002
 why-0.002
Token likes
Token ID7832
Feature activation+9.22e-03

Pos logit contributions
eling+0.008
 tale+0.008
land+0.008
itive+0.008
 why+0.007
Neg logit contributions
apped-0.009
enge-0.009
amed-0.009
ay-0.009
umbled-0.008
Token to
Token ID284
Feature activation+1.36e-02

Pos logit contributions
 why+0.014
land+0.013
eling+0.013
 tale+0.013
 lav+0.013
Neg logit contributions
apped-0.015
amed-0.014
enge-0.014
umbled-0.014
ay-0.013
Token chase
Token ID15505
Feature activation+8.85e-03

Pos logit contributions
eling+0.018
itive+0.017
land+0.017
 tale+0.016
 lav+0.016
Neg logit contributions
apped-0.022
enge-0.021
ay-0.021
amed-0.020
umbled-0.020
Token things
Token ID1243
Feature activation+3.90e-03

Pos logit contributions
eling+0.010
 why+0.010
land+0.010
 tale+0.010
itive+0.010
Neg logit contributions
apped-0.017
amed-0.016
enge-0.015
umbled-0.015
ane-0.014
Token.
Token ID13
Feature activation-3.18e-03

Pos logit contributions
eling+0.004
land+0.004
 tale+0.004
itive+0.003
 why+0.003
Neg logit contributions
apped-0.008
enge-0.008
amed-0.008
umbled-0.008
ay-0.008
Token Are
Token ID4231
Feature activation+3.86e-02

Pos logit contributions
apped+0.005
enge+0.004
amed+0.004
umbled+0.004
ay+0.004
Neg logit contributions
eling-0.005
land-0.005
 tale-0.005
itive-0.005
 why-0.005
Token you
Token ID345
Feature activation+3.62e-03

Pos logit contributions
land+0.067
 tale+0.064
eling+0.064
itive+0.064
 lav+0.063
Neg logit contributions
apped-0.049
enge-0.046
ay-0.044
amed-0.043
umbled-0.043
Token We
Token ID775
Feature activation+4.80e-03

Pos logit contributions
amed+0.016
umbled+0.016
enge+0.016
apped+0.016
ied+0.015
Neg logit contributions
 why-0.014
land-0.014
mmm-0.013
 tale-0.013
 gn-0.013
Token restored
Token ID15032
Feature activation-1.92e-02

Pos logit contributions
eling+0.007
itive+0.006
 tale+0.006
 seem+0.006
 why+0.006
Neg logit contributions
apped-0.008
umbled-0.008
enge-0.008
ay-0.007
amed-0.007
Token the
Token ID262
Feature activation-1.96e-02

Pos logit contributions
apped+0.036
umbled+0.034
amed+0.034
enge+0.034
ase+0.032
Neg logit contributions
 why-0.023
eling-0.023
land-0.022
 what-0.022
 tale-0.021
Token old
Token ID1468
Feature activation+2.56e-03

Pos logit contributions
apped+0.036
umbled+0.034
enge+0.034
amed+0.034
ay+0.032
Neg logit contributions
 tale-0.024
eling-0.024
land-0.024
 lav-0.023
 tul-0.023
Token house
Token ID2156
Feature activation-1.06e-02

Pos logit contributions
 tale+0.003
land+0.003
 gn+0.003
itive+0.003
 tul+0.003
Neg logit contributions
apped-0.004
amed-0.004
enge-0.004
umbled-0.004
ay-0.004
Token!"
Token ID2474
Feature activation-5.31e-03

Pos logit contributions
apped+0.020
enge+0.020
umbled+0.019
amed+0.019
ay+0.018
Neg logit contributions
land-0.012
eling-0.012
 tale-0.011
 why-0.011
itive-0.011
Token They
Token ID1119
Feature activation-1.18e-02

Pos logit contributions
enge+0.010
amed+0.010
apped+0.010
umbled+0.010
ay+0.009
Neg logit contributions
 why-0.006
eling-0.006
 tale-0.006
 gn-0.005
land-0.005
Token hugged
Token ID41130
Feature activation-2.93e-02

Pos logit contributions
enge+0.022
umbled+0.021
ay+0.021
ase+0.020
apped+0.020
Neg logit contributions
 scr-0.013
eling-0.013
 gn-0.013
itive-0.013
 rep-0.013
Token and
Token ID290
Feature activation-1.55e-03

Pos logit contributions
apped+0.045
enge+0.042
ay+0.041
amed+0.041
umbled+0.040
Neg logit contributions
eling-0.045
land-0.043
 tale-0.043
itive-0.042
 lav-0.041
Token laughed
Token ID13818
Feature activation+1.98e-04

Pos logit contributions
apped+0.002
enge+0.002
ay+0.002
umbled+0.002
amed+0.002
Neg logit contributions
eling-0.002
land-0.002
itive-0.002
 tale-0.002
 lav-0.002
Token together
Token ID1978
Feature activation-4.10e-03

Pos logit contributions
eling+0.000
 tale+0.000
land+0.000
 lav+0.000
itive+0.000
Neg logit contributions
apped-0.000
amed-0.000
ay-0.000
umbled-0.000
enge-0.000
Token you
Token ID345
Feature activation-9.18e-03

Pos logit contributions
apped+0.007
enge+0.007
ay+0.006
amed+0.006
umbled+0.006
Neg logit contributions
eling-0.008
land-0.008
itive-0.008
 tale-0.008
 gn-0.007
Token want
Token ID765
Feature activation-1.10e-02

Pos logit contributions
apped+0.019
enge+0.018
ay+0.017
amed+0.017
umbled+0.017
Neg logit contributions
eling-0.009
land-0.009
itive-0.009
 tale-0.009
 why-0.008
Token to
Token ID284
Feature activation+5.29e-03

Pos logit contributions
apped+0.021
umbled+0.020
amed+0.020
enge+0.019
amer+0.019
Neg logit contributions
 why-0.015
land-0.013
 what-0.013
 tale-0.013
 lav-0.013
Token do
Token ID466
Feature activation+6.02e-04

Pos logit contributions
eling+0.007
land+0.007
 tale+0.007
itive+0.007
 lav+0.006
Neg logit contributions
apped-0.009
ay-0.008
enge-0.008
umbled-0.008
amed-0.008
Token with
Token ID351
Feature activation+4.15e-03

Pos logit contributions
eling+0.001
land+0.001
 tale+0.001
itive+0.001
 why+0.001
Neg logit contributions
apped-0.001
enge-0.001
umbled-0.001
amed-0.001
ay-0.001
Token it
Token ID340
Feature activation-4.32e-03

Pos logit contributions
eling+0.006
land+0.005
 tale+0.005
itive+0.005
 lav+0.005
Neg logit contributions
apped-0.007
enge-0.007
amed-0.006
umbled-0.006
ay-0.006
Token?".
Token ID43634
Feature activation+3.36e-03

Pos logit contributions
apped+0.008
enge+0.007
amed+0.007
umbled+0.007
ay+0.007
Neg logit contributions
eling-0.005
land-0.005
 tale-0.005
itive-0.005
 why-0.005
Token\n
Token ID198
Feature activation+9.78e-03

Pos logit contributions
eling+0.005
itive+0.005
 lav+0.005
 tale+0.005
 tul+0.005
Neg logit contributions
apped-0.005
ay-0.005
enge-0.005
amed-0.004
umbled-0.004
Token\n
Token ID198
Feature activation+1.81e-02

Pos logit contributions
eling+0.012
itive+0.012
land+0.011
 tale+0.011
 lav+0.011
Neg logit contributions
apped-0.017
enge-0.017
ay-0.016
amed-0.016
umbled-0.016
TokenAlice
Token ID44484
Feature activation-1.40e-02

Pos logit contributions
eling+0.026
itive+0.025
land+0.024
 tale+0.024
 lav+0.024
Neg logit contributions
apped-0.029
enge-0.028
amed-0.028
ay-0.027
umbled-0.027
Token grinned
Token ID36268
Feature activation+1.71e-03

Pos logit contributions
apped+0.022
enge+0.021
amed+0.021
umbled+0.020
ay+0.020
Neg logit contributions
eling-0.020
land-0.020
itive-0.019
 tale-0.019
 lav-0.019
Token and
Token ID290
Feature activation+1.32e-02

Pos logit contributions
eling+0.027
land+0.026
itive+0.026
 tale+0.026
 lav+0.024
Neg logit contributions
apped-0.054
enge-0.052
amed-0.051
ay-0.051
umbled-0.050
Token licking
Token ID43390
Feature activation-5.50e-03

Pos logit contributions
land+0.018
 tale+0.017
itive+0.017
eling+0.017
mmm+0.016
Neg logit contributions
apped-0.023
amed-0.021
umbled-0.020
enge-0.020
ay-0.019
Token his
Token ID465
Feature activation+1.52e-02

Pos logit contributions
apped+0.009
enge+0.009
ay+0.008
amed+0.008
umbled+0.008
Neg logit contributions
eling-0.008
land-0.007
itive-0.007
 tale-0.007
 lav-0.007
Token leg
Token ID1232
Feature activation+2.43e-02

Pos logit contributions
eling+0.020
 tale+0.020
 scr+0.020
land+0.020
itive+0.019
Neg logit contributions
apped-0.025
enge-0.024
amed-0.024
umbled-0.023
ay-0.023
Token.
Token ID13
Feature activation-3.90e-03

Pos logit contributions
eling+0.030
 lav+0.027
 tul+0.027
itive+0.027
land+0.026
Neg logit contributions
apped-0.043
ay-0.042
umbled-0.041
amed-0.040
enge-0.039
Token\n
Token ID198
Feature activation-2.49e-03

Pos logit contributions
apped+0.007
enge+0.007
amed+0.006
ay+0.006
umbled+0.006
Neg logit contributions
eling-0.005
land-0.005
 tale-0.005
itive-0.005
 lav-0.005
Token\n
Token ID198
Feature activation+3.28e-03

Pos logit contributions
apped+0.005
enge+0.004
ay+0.004
amed+0.004
umbled+0.004
Neg logit contributions
eling-0.003
land-0.003
itive-0.003
 tale-0.003
 lav-0.003
Token"
Token ID1
Feature activation-3.22e-03

Pos logit contributions
eling+0.004
land+0.004
 tale+0.004
itive+0.004
 lav+0.004
Neg logit contributions
apped-0.006
enge-0.006
umbled-0.006
ay-0.006
amed-0.006
TokenRex
Token ID47389
Feature activation+9.49e-03

Pos logit contributions
apped+0.005
ay+0.004
umbled+0.004
enge+0.004
amed+0.004
Neg logit contributions
eling-0.005
 gn-0.005
itive-0.005
 tale-0.005
land-0.005
Token,
Token ID11
Feature activation+1.80e-03

Pos logit contributions
eling+0.009
land+0.009
itive+0.009
 tale+0.008
 lav+0.008
Neg logit contributions
apped-0.019
enge-0.019
ay-0.018
amed-0.018
umbled-0.018
Token are
Token ID389
Feature activation-7.58e-05

Pos logit contributions
eling+0.003
 why+0.002
land+0.002
 tale+0.002
itive+0.002
Neg logit contributions
apped-0.003
umbled-0.003
enge-0.003
amed-0.003
ay-0.003
Token"
Token ID1
Feature activation-5.35e-03

Pos logit contributions
eling+0.015
land+0.015
 tale+0.014
itive+0.014
 why+0.014
Neg logit contributions
apped-0.029
enge-0.027
umbled-0.026
ay-0.026
amed-0.026
TokenHey
Token ID10814
Feature activation+2.06e-02

Pos logit contributions
apped+0.010
enge+0.009
ay+0.009
umbled+0.009
amed+0.009
Neg logit contributions
eling-0.007
land-0.006
itive-0.006
 tale-0.006
 lav-0.006
Token!
Token ID0
Feature activation-8.01e-03

Pos logit contributions
eling+0.025
land+0.023
 lav+0.022
itive+0.022
 tale+0.021
Neg logit contributions
apped-0.038
ay-0.037
enge-0.037
amed-0.035
umbled-0.034
Token Come
Token ID7911
Feature activation+8.77e-03

Pos logit contributions
apped+0.013
amed+0.012
enge+0.012
umbled+0.012
ay+0.012
Neg logit contributions
land-0.011
eling-0.010
 tale-0.010
 why-0.010
 gn-0.010
Token back
Token ID736
Feature activation-3.66e-02

Pos logit contributions
 tale+0.014
land+0.013
eling+0.013
 why+0.013
 gn+0.013
Neg logit contributions
apped-0.012
umbled-0.012
enge-0.012
amed-0.012
ay-0.011
Token here
Token ID994
Feature activation-6.78e-03

Pos logit contributions
apped+0.071
umbled+0.068
enge+0.067
amed+0.066
ay+0.064
Neg logit contributions
land-0.043
 why-0.040
 tale-0.039
 gn-0.038
eling-0.038
Token!"
Token ID2474
Feature activation-3.68e-03

Pos logit contributions
apped+0.015
umbled+0.014
enge+0.014
amed+0.014
ay+0.014
Neg logit contributions
land-0.006
 why-0.006
 tale-0.006
eling-0.005
 hesitation-0.005
Token Sarah
Token ID10490
Feature activation-1.40e-02

Pos logit contributions
apped+0.006
enge+0.006
umbled+0.006
amed+0.006
ay+0.006
Neg logit contributions
eling-0.004
 why-0.004
 tale-0.004
land-0.004
 seem-0.004
Token yelled
Token ID22760
Feature activation-2.32e-02

Pos logit contributions
apped+0.023
ay+0.022
enge+0.022
umbled+0.021
amed+0.021
Neg logit contributions
eling-0.019
land-0.019
itive-0.018
 tale-0.018
 gn-0.018
Token.
Token ID13
Feature activation-1.13e-02

Pos logit contributions
apped+0.046
enge+0.043
amed+0.042
ay+0.042
umbled+0.042
Neg logit contributions
eling-0.026
land-0.023
 tale-0.023
itive-0.022
 lav-0.022
Token\n
Token ID198
Feature activation+1.78e-03

Pos logit contributions
apped+0.022
enge+0.021
amed+0.021
ay+0.020
umbled+0.020
Neg logit contributions
land-0.012
eling-0.012
 tale-0.011
itive-0.011
 why-0.011
Token hospital
Token ID4436
Feature activation+2.64e-03

Pos logit contributions
apped+0.011
enge+0.010
ay+0.010
amed+0.010
umbled+0.010
Neg logit contributions
eling-0.007
land-0.007
 tale-0.007
itive-0.007
 lav-0.006
Token.
Token ID13
Feature activation-2.87e-03

Pos logit contributions
eling+0.003
 lav+0.003
itive+0.003
 tul+0.003
 tale+0.003
Neg logit contributions
apped-0.005
amed-0.004
enge-0.004
ay-0.004
umbled-0.004
Token The
Token ID383
Feature activation+1.14e-02

Pos logit contributions
apped+0.005
enge+0.005
amed+0.005
ay+0.005
umbled+0.005
Neg logit contributions
eling-0.004
land-0.003
 tale-0.003
itive-0.003
 lav-0.003
Token little
Token ID1310
Feature activation+1.91e-02

Pos logit contributions
eling+0.013
land+0.012
itive+0.012
 tale+0.012
 lav+0.011
Neg logit contributions
apped-0.022
enge-0.021
amed-0.021
ay-0.021
umbled-0.021
Token girl
Token ID2576
Feature activation-4.92e-04

Pos logit contributions
eling+0.023
itive+0.023
land+0.023
 tale+0.022
 lav+0.022
Neg logit contributions
apped-0.034
enge-0.033
amed-0.032
ay-0.032
umbled-0.032
Token had
Token ID550
Feature activation-7.14e-03

Pos logit contributions
enge+0.001
ay+0.001
apped+0.001
umbled+0.001
amed+0.001
Neg logit contributions
eling-0.001
 tale-0.001
land-0.001
 why-0.001
 gn-0.001
Token to
Token ID284
Feature activation+2.03e-02

Pos logit contributions
apped+0.013
ay+0.013
enge+0.013
amer+0.012
ane+0.012
Neg logit contributions
 why-0.009
 hesitation-0.009
land-0.008
 wishing-0.008
 tale-0.008
Token stay
Token ID2652
Feature activation+1.82e-02

Pos logit contributions
eling+0.030
land+0.030
itive+0.030
 lav+0.028
 tale+0.028
Neg logit contributions
apped-0.029
enge-0.028
ay-0.027
amed-0.027
umbled-0.025
Token in
Token ID287
Feature activation+1.39e-02

Pos logit contributions
eling+0.030
land+0.029
 tale+0.029
itive+0.028
 why+0.028
Neg logit contributions
apped-0.025
enge-0.024
ay-0.024
umbled-0.023
amed-0.023
Token the
Token ID262
Feature activation+9.84e-03

Pos logit contributions
eling+0.020
itive+0.019
land+0.019
 tale+0.018
 scr+0.018
Neg logit contributions
apped-0.021
ay-0.021
enge-0.020
amed-0.020
umbled-0.019
Token hospital
Token ID4436
Feature activation+9.95e-03

Pos logit contributions
eling+0.013
itive+0.013
land+0.013
 tale+0.013
 lav+0.012
Neg logit contributions
apped-0.016
enge-0.015
amed-0.015
ay-0.015
umbled-0.015
Token could
Token ID714
Feature activation-7.01e-03

Pos logit contributions
apped+0.020
enge+0.019
amed+0.019
umbled+0.018
ay+0.018
Neg logit contributions
land-0.015
 seem-0.015
 gn-0.015
 tale-0.014
eling-0.014
Token think
Token ID892
Feature activation+1.00e-02

Pos logit contributions
apped+0.012
enge+0.012
ay+0.011
amed+0.011
umbled+0.011
Neg logit contributions
eling-0.009
land-0.009
itive-0.008
 tale-0.008
 lav-0.008
Token of
Token ID286
Feature activation-1.18e-02

Pos logit contributions
eling+0.015
land+0.014
itive+0.014
 tale+0.014
 lav+0.014
Neg logit contributions
apped-0.015
enge-0.015
ay-0.014
amed-0.014
umbled-0.014
Token.
Token ID13
Feature activation-1.63e-03

Pos logit contributions
apped+0.018
enge+0.017
ay+0.016
amed+0.016
umbled+0.016
Neg logit contributions
eling-0.016
itive-0.016
land-0.016
 tale-0.015
 lav-0.015
Token Everywhere
Token ID48201
Feature activation+8.31e-03

Pos logit contributions
apped+0.003
enge+0.003
amed+0.003
ay+0.003
umbled+0.003
Neg logit contributions
land-0.002
eling-0.002
 tale-0.002
 why-0.002
itive-0.002
Token she
Token ID673
Feature activation-4.18e-02

Pos logit contributions
eling+0.004
 why+0.003
 tale+0.003
land+0.003
itive+0.003
Neg logit contributions
apped-0.022
amed-0.021
enge-0.021
umbled-0.021
ay-0.021
Token went
Token ID1816
Feature activation-1.64e-02

Pos logit contributions
apped+0.068
enge+0.065
ay+0.064
amed+0.064
umbled+0.063
Neg logit contributions
eling-0.058
land-0.056
 tale-0.055
itive-0.054
 gn-0.054
Token,
Token ID11
Feature activation-8.79e-03

Pos logit contributions
apped+0.038
enge+0.036
amed+0.036
ay+0.036
umbled+0.036
Neg logit contributions
eling-0.012
land-0.011
 tale-0.011
itive-0.011
 why-0.010
Token she
Token ID673
Feature activation-2.62e-02

Pos logit contributions
apped+0.018
enge+0.018
ay+0.017
amed+0.017
umbled+0.017
Neg logit contributions
eling-0.008
land-0.008
 tale-0.008
itive-0.008
 lav-0.007
Token brought
Token ID3181
Feature activation-1.19e-02

Pos logit contributions
ay+0.047
apped+0.046
enge+0.045
umbled+0.044
amed+0.043
Neg logit contributions
eling-0.035
 gn-0.033
land-0.032
 why-0.031
 tale-0.030
Token a
Token ID257
Feature activation-1.24e-02

Pos logit contributions
apped+0.024
enge+0.023
umbled+0.023
ay+0.022
amed+0.022
Neg logit contributions
land-0.012
 tale-0.012
eling-0.011
 lav-0.011
 why-0.011
Token,
Token ID11
Feature activation+9.62e-03

Pos logit contributions
apped+0.044
ay+0.041
enge+0.041
amed+0.040
umbled+0.039
Neg logit contributions
eling-0.040
land-0.037
itive-0.037
 tale-0.036
 lav-0.035
Token "
Token ID366
Feature activation+1.46e-02

Pos logit contributions
eling+0.014
 tale+0.013
itive+0.013
land+0.013
 lav+0.012
Neg logit contributions
apped-0.016
amed-0.015
enge-0.014
umbled-0.014
ay-0.014
TokenWhere
Token ID8496
Feature activation+3.69e-02

Pos logit contributions
eling+0.018
 tale+0.017
land+0.017
itive+0.017
 lav+0.016
Neg logit contributions
apped-0.026
enge-0.025
amed-0.025
ay-0.024
umbled-0.024
Token are
Token ID389
Feature activation-1.81e-02

Pos logit contributions
 why+0.048
 seem+0.046
eling+0.045
land+0.045
 tale+0.044
Neg logit contributions
amed-0.065
apped-0.064
umbled-0.063
enge-0.062
amer-0.059
Token you
Token ID345
Feature activation+1.94e-03

Pos logit contributions
apped+0.029
enge+0.028
amed+0.028
umbled+0.027
ay+0.027
Neg logit contributions
eling-0.025
land-0.025
 tale-0.024
itive-0.024
 why-0.024
Token all
Token ID477
Feature activation-4.25e-02

Pos logit contributions
eling+0.003
land+0.003
 tale+0.002
itive+0.002
 lav+0.002
Neg logit contributions
apped-0.003
enge-0.003
amed-0.003
ay-0.003
umbled-0.003
Token?"
Token ID1701
Feature activation+1.00e-03

Pos logit contributions
apped+0.082
enge+0.079
umbled+0.076
amed+0.076
ay+0.076
Neg logit contributions
eling-0.051
land-0.047
 tale-0.047
 why-0.046
 gn-0.046
Token\n
Token ID198
Feature activation-6.29e-03

Pos logit contributions
eling+0.001
itive+0.001
land+0.001
 tale+0.001
 lav+0.001
Neg logit contributions
apped-0.002
enge-0.002
umbled-0.001
amed-0.001
ay-0.001
Token\n
Token ID198
Feature activation+7.96e-04

Pos logit contributions
apped+0.014
enge+0.013
umbled+0.013
amed+0.013
ay+0.013
Neg logit contributions
land-0.005
 why-0.005
 gn-0.005
 tale-0.005
eling-0.004
TokenHe
Token ID1544
Feature activation+1.32e-04

Pos logit contributions
land+0.001
eling+0.001
 tale+0.001
itive+0.001
 why+0.001
Neg logit contributions
apped-0.002
enge-0.001
ay-0.001
umbled-0.001
amed-0.001
Token waited
Token ID13488
Feature activation+8.20e-03

Pos logit contributions
eling+0.000
land+0.000
 tale+0.000
itive+0.000
 why+0.000
Neg logit contributions
apped-0.000
enge-0.000
ay-0.000
umbled-0.000
amed-0.000
Token ruler
Token ID22740
Feature activation-8.81e-03

Pos logit contributions
eling+0.002
itive+0.002
land+0.002
 why+0.002
 tale+0.002
Neg logit contributions
apped-0.002
ay-0.002
enge-0.001
amed-0.001
ite-0.001
Token to
Token ID284
Feature activation+5.11e-03

Pos logit contributions
apped+0.016
enge+0.016
amed+0.015
umbled+0.015
ay+0.015
Neg logit contributions
land-0.010
 why-0.010
 tale-0.010
itive-0.010
 hesitation-0.010
Token measure
Token ID3953
Feature activation-9.00e-03

Pos logit contributions
eling+0.006
land+0.006
itive+0.005
 tale+0.005
 lav+0.005
Neg logit contributions
apped-0.010
enge-0.009
amed-0.009
ay-0.009
umbled-0.009
Token how
Token ID703
Feature activation+2.63e-03

Pos logit contributions
apped+0.014
enge+0.013
ay+0.013
umbled+0.013
amed+0.013
Neg logit contributions
eling-0.013
 tale-0.013
itive-0.012
land-0.012
 gn-0.012
Token big
Token ID1263
Feature activation-2.94e-02

Pos logit contributions
eling+0.004
land+0.003
 why+0.003
itive+0.003
 lav+0.003
Neg logit contributions
apped-0.004
enge-0.004
amed-0.004
ay-0.004
umbled-0.004
Token it
Token ID340
Feature activation-4.50e-02

Pos logit contributions
apped+0.045
enge+0.043
ay+0.042
amed+0.041
umbled+0.041
Neg logit contributions
eling-0.043
land-0.042
itive-0.042
 tale-0.041
 gn-0.039
Token was
Token ID373
Feature activation-4.02e-03

Pos logit contributions
apped+0.066
enge+0.062
ay+0.061
amed+0.060
umbled+0.060
Neg logit contributions
eling-0.071
land-0.068
itive-0.067
 tale-0.067
 gn-0.065
Token.
Token ID13
Feature activation-3.78e-03

Pos logit contributions
apped+0.006
amed+0.006
umbled+0.006
ay+0.006
enge+0.006
Neg logit contributions
eling-0.005
itive-0.005
 tale-0.005
land-0.005
 tul-0.005
Token The
Token ID383
Feature activation+6.00e-03

Pos logit contributions
apped+0.006
enge+0.006
amed+0.006
ay+0.006
umbled+0.006
Neg logit contributions
eling-0.005
land-0.005
 tale-0.005
itive-0.005
 lav-0.005
Token sw
Token ID1509
Feature activation-1.47e-02

Pos logit contributions
eling+0.009
itive+0.009
land+0.009
 why+0.009
 scr+0.008
Neg logit contributions
apped-0.009
amed-0.008
ay-0.008
enge-0.008
umbled-0.008
Tokenan
Token ID272
Feature activation-1.43e-02

Pos logit contributions
apped+0.023
ay+0.022
enge+0.022
amed+0.021
umbled+0.021
Neg logit contributions
eling-0.020
land-0.020
 tale-0.019
itive-0.019
 lav-0.018
Token help
Token ID1037
Feature activation-2.23e-02

Pos logit contributions
apped+0.001
enge+0.001
amed+0.001
umbled+0.001
ay+0.001
Neg logit contributions
eling-0.001
land-0.001
 tale-0.001
itive-0.001
 lav-0.001
Token you
Token ID345
Feature activation-6.58e-03

Pos logit contributions
apped+0.041
enge+0.039
amed+0.039
umbled+0.038
ay+0.036
Neg logit contributions
eling-0.028
 why-0.028
land-0.027
 tale-0.026
itive-0.025
Token make
Token ID787
Feature activation-9.22e-03

Pos logit contributions
apped+0.009
ay+0.008
amed+0.008
enge+0.008
box+0.008
Neg logit contributions
eling-0.010
 tale-0.010
itive-0.010
 hesitation-0.010
land-0.009
Token a
Token ID257
Feature activation-8.85e-03

Pos logit contributions
apped+0.016
enge+0.015
ay+0.015
amed+0.015
umbled+0.015
Neg logit contributions
eling-0.012
land-0.012
itive-0.011
 tale-0.011
 lav-0.011
Token new
Token ID649
Feature activation-4.02e-02

Pos logit contributions
apped+0.014
ay+0.014
enge+0.014
amed+0.013
umbled+0.013
Neg logit contributions
eling-0.012
itive-0.012
land-0.011
 hesitation-0.011
 tale-0.011
Token one
Token ID530
Feature activation-5.39e-02

Pos logit contributions
apped+0.066
enge+0.063
ay+0.061
amed+0.061
umbled+0.060
Neg logit contributions
eling-0.056
land-0.054
itive-0.053
 tale-0.053
 lav-0.051
Token?
Token ID30
Feature activation+4.06e-03

Pos logit contributions
apped+0.080
ay+0.078
enge+0.076
umbled+0.071
amed+0.071
Neg logit contributions
eling-0.082
itive-0.078
land-0.078
 hesitation-0.075
 gn-0.075
Token I
Token ID314
Feature activation+8.05e-03

Pos logit contributions
eling+0.006
land+0.006
 tale+0.006
itive+0.006
 lav+0.006
Neg logit contributions
apped-0.006
enge-0.006
amed-0.005
umbled-0.005
ay-0.005
Token know
Token ID760
Feature activation-6.68e-03

Pos logit contributions
eling+0.011
land+0.010
 tale+0.010
itive+0.010
 why+0.010
Neg logit contributions
apped-0.014
enge-0.013
amed-0.013
umbled-0.013
ay-0.013
Token a
Token ID257
Feature activation-1.55e-02

Pos logit contributions
apped+0.011
enge+0.010
ay+0.010
amed+0.010
umbled+0.010
Neg logit contributions
eling-0.009
itive-0.009
 tale-0.009
land-0.009
 gn-0.008
Token good
Token ID922
Feature activation-8.67e-03

Pos logit contributions
ay+0.023
apped+0.023
enge+0.023
box+0.021
amed+0.021
Neg logit contributions
eling-0.024
itive-0.022
land-0.020
 tale-0.019
 hesitation-0.019
Token something
Token ID1223
Feature activation+2.00e-02

Pos logit contributions
apped+0.020
umbled+0.019
enge+0.019
amed+0.019
ay+0.019
Neg logit contributions
eling-0.010
land-0.010
 tale-0.009
 why-0.009
 lav-0.009
Token special
Token ID2041
Feature activation-2.34e-02

Pos logit contributions
eling+0.031
itive+0.030
land+0.028
 tale+0.028
 hesitation+0.027
Neg logit contributions
apped-0.028
enge-0.028
ay-0.028
amed-0.026
umbled-0.025
Token,
Token ID11
Feature activation-9.29e-03

Pos logit contributions
apped+0.048
enge+0.046
amed+0.046
ay+0.045
umbled+0.045
Neg logit contributions
eling-0.022
land-0.022
 tale-0.021
itive-0.021
 why-0.020
Token something
Token ID1223
Feature activation+2.78e-02

Pos logit contributions
apped+0.017
ay+0.016
enge+0.016
amed+0.016
umbled+0.015
Neg logit contributions
eling-0.011
itive-0.010
land-0.010
 tale-0.010
 gn-0.009
Token no
Token ID645
Feature activation-1.30e-02

Pos logit contributions
itive+0.041
eling+0.041
land+0.039
 tale+0.037
 hesitation+0.036
Neg logit contributions
apped-0.041
enge-0.040
amed-0.038
ay-0.038
umbled-0.037
Token one
Token ID530
Feature activation-5.13e-02

Pos logit contributions
ay+0.020
enge+0.020
apped+0.019
 of+0.018
ied+0.018
Neg logit contributions
eling-0.018
 symp-0.017
 scr-0.016
itive-0.016
Far-0.016
Token else
Token ID2073
Feature activation-1.36e-02

Pos logit contributions
apped+0.081
ay+0.078
amed+0.078
enge+0.076
umbled+0.075
Neg logit contributions
 why-0.074
eling-0.072
land-0.071
 tale-0.071
 lav-0.070
Token had
Token ID550
Feature activation-1.24e-02

Pos logit contributions
apped+0.021
ay+0.020
enge+0.020
amed+0.020
umbled+0.019
Neg logit contributions
 why-0.021
eling-0.020
land-0.020
 tale-0.020
 seem-0.019
Token.
Token ID13
Feature activation-2.34e-03

Pos logit contributions
apped+0.016
enge+0.016
ay+0.015
amed+0.014
umbled+0.014
Neg logit contributions
land-0.022
eling-0.022
 tale-0.021
itive-0.021
 why-0.020
Token So
Token ID1406
Feature activation+4.51e-03

Pos logit contributions
apped+0.004
enge+0.004
ay+0.004
umbled+0.003
amed+0.003
Neg logit contributions
eling-0.004
itive-0.003
 tale-0.003
land-0.003
 lav-0.003
Token one
Token ID530
Feature activation-3.42e-02

Pos logit contributions
eling+0.006
land+0.006
itive+0.006
 tale+0.006
uf+0.005
Neg logit contributions
apped-0.008
ay-0.007
enge-0.007
umbled-0.007
amed-0.007