TOP ACTIVATIONS
MAX = 0.125

, " Thank you , Lucy ! You are so kind
owing in the wind . Lucy started jumping up and down
" We love you , Lucy . You are beautiful and
and had no home . Lucy thought that she should take
From that day on , Lucy kept guessing how far she
The next day , Lucy and her friends went to
From that moment on , Lucy marked all her w affles
And so , Lucy and Peter left the forest
The next day , Samantha was feeling much better .
." So , Lucy and her mom bought the
a question . ” Lucy looked confused and asked
like this , ” Lucy said . “
No I don 't ," Lucy answered .
" It 's okay , Lucy . We all make mistakes
that a clever girl like Lucy would never allow others to
With that , Lucy and her mom left the
't move , Tom ," Lucy mentioned . " I can
looked sad and lonely . Lucy wanted to make him feel
" Let 's compromise . Lucy can play on the rail
Fin was very happy . Lucy felt like she had rescued

MIN ACTIVATIONS
MIN = -0.201

let out a big sn arl and the mouse knew he
" Hello , my d arl ings . I 'm so
What happened , my d arl ings ? Why do you
perfect fish , my d arl ings ," mom says .
" Hello , my d arl ings . I 'm so
you okay , my d arl ings ?" she asked .
you go , my d arl ings ," Mom said .
to grow l and sn arl to try and scare her
" Hello , my d arl ings . I 'm so
disappointment . M arl on 's mum hugged him
Not now , my d arl ings , it is too
What happened , my d arl ings ? Why are you
'm sorry , my d arl ings . The tornado took
We know , my d arl ings . We miss them
Spot grow ls and sn arl s . He tries to
'm sorry , my d arl ings . That was a
you okay , my d arl ings ?" Mom asked .
you too , my d arl ings . Now let 's
's good , my d arl ings . I 'm glad
there was a 4 - year old girl named Sus ie

INTERVAL 0.044 - 0.125
CONTAINS 4.091%

. He thought for a while and then he got an
Once there was an angry kitten . It felt so mad
past ries . He has sold them all to other children
said , " It 's still a fun toy to play
loves children . You can pet him if you want ."

INTERVAL -0.038 - 0.044
CONTAINS 90.236%

it safe . Rose thanked the man , put her hand
and splash into the water . The bear follows them .
really brave and he liked to explore !
of her mom for helping her and she wasn 't ashamed
it all the way to the top of the cable and

INTERVAL -0.119 - -0.038
CONTAINS 5.660%

started spilling everywhere . M andy was scared and felt bad
the fun attractions . They realized that the park was the
. " That is why you must be careful ". Joe
the big engine . She knew that taking a bath was
was amazed ! J enny shouted with joy as

INTERVAL -0.201 - -0.119
CONTAINS 0.013%

A 3 - year - old toddler walked up
Jen feels even sad der . She thinks Tim is
John said , " Care ful , Mom my !"
They said , " Ar rr , mate y , we
so happy that Ph oe be was so clever and forgiving
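
The four INTERVAL blocks above look like equal-width bins spanning the feature's activation range from MIN to MAX. This is an inference from the printed numbers, not something the dump states. Below is a minimal sketch, under that assumption, that recomputes the bin edges and checks that the CONTAINS shares sum to 100%. The names `interval_edges` and `shares` are hypothetical and belong to no tool.

```python
# Sketch only: assumes the dashboard splits [MIN, MAX] into equal-width bins.
# MIN, MAX and the shares are copied from the dump; everything else is illustrative.

def interval_edges(lo, hi, n_bins=4):
    """Return the n_bins + 1 edges of equal-width bins covering [lo, hi]."""
    step = (hi - lo) / n_bins
    return [lo + i * step for i in range(n_bins + 1)]

# MIN = -0.201 and MAX = 0.125 as printed above (both already rounded).
edges = interval_edges(-0.201, 0.125)

# CONTAINS percentages, listed from the lowest interval to the highest.
shares = [0.013, 5.660, 90.236, 4.091]

for lo, hi, share in zip(edges[:-1], edges[1:], shares):
    print(f"{lo:+.4f} .. {hi:+.4f}  {share:7.3f}%")
print(f"total share: {sum(shares):.3f}%")
```

Because MIN and MAX are themselves rounded to three decimals, the recomputed edges can differ from the printed ones in the last digit.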
Token,
Token ID11
Feature activation+5.04e-02

Pos logit contributions
eling+0.001
ited+0.001
ries+0.000
 unw+0.000
ivering+0.000
Neg logit contributions
 Their-0.001
 many-0.000
ily-0.000
 Some-0.000
 lesson-0.000
Token "
Token ID366
Feature activation+4.29e-02

Pos logit contributions
eling+0.084
ited+0.075
!+0.075
 unw+0.064
!"+0.061
Neg logit contributions
 Their-0.081
 Some-0.079
arl-0.075
itted-0.075
year-0.074
TokenThank
Token ID10449
Feature activation+4.14e-02

Pos logit contributions
ited+0.077
eling+0.077
!+0.064
 unw+0.061
 bit+0.061
Neg logit contributions
 Their-0.061
arl-0.059
avy-0.058
 Some-0.057
itted-0.056
Token you
Token ID345
Feature activation-2.63e-02

Pos logit contributions
eling+0.054
ited+0.048
!+0.044
 unw+0.035
!"+0.033
Neg logit contributions
 Their-0.085
 Some-0.082
arl-0.078
itted-0.077
ily-0.077
Token,
Token ID11
Feature activation+1.63e-02

Pos logit contributions
 Their+0.052
 Some+0.051
ily+0.048
arl+0.048
itted+0.047
Neg logit contributions
eling-0.036
ited-0.032
!-0.028
 unw-0.024
 transported-0.023
Token Lucy
Token ID22162
Feature activation+1.20e-01

Pos logit contributions
eling+0.030
!+0.029
ited+0.028
 unw+0.025
!"+0.025
Neg logit contributions
 Their-0.022
itted-0.021
 Some-0.021
year-0.020
arl-0.019
Token!
Token ID0
Feature activation-7.44e-03

Pos logit contributions
eling+0.148
ited+0.131
 unw+0.092
 transported+0.091
ge+0.087
Neg logit contributions
 Their-0.256
ily-0.247
 Some-0.246
 lesson-0.237
 Everyone-0.236
Token You
Token ID921
Feature activation-1.72e-02

Pos logit contributions
 Their+0.011
utes+0.010
 lesson+0.010
itted+0.010
 Some+0.010
Neg logit contributions
eling-0.012
!-0.011
ited-0.011
!"-0.010
 unw-0.009
Token are
Token ID389
Feature activation+5.91e-02

Pos logit contributions
ily+0.026
 Their+0.025
 Some+0.025
 Everyone+0.024
 lesson+0.023
Neg logit contributions
eling-0.031
ited-0.028
ge-0.026
ries-0.025
ivering-0.024
Token so
Token ID523
Feature activation-1.33e-02

Pos logit contributions
ited+0.129
eling+0.127
ge+0.110
ivering+0.102
!+0.101
Neg logit contributions
 Their-0.072
ily-0.067
 many-0.065
 Some-0.064
 Everyone-0.061
Token kind
Token ID1611
Feature activation-1.28e-02

Pos logit contributions
 Their+0.018
 Some+0.017
ily+0.016
year+0.016
arl+0.016
Neg logit contributions
eling-0.027
ited-0.025
!-0.022
ge-0.021
 unw-0.021
Tokenowing
Token ID7855
Feature activation+2.26e-02

Pos logit contributions
 Their+0.021
 Some+0.020
arl+0.019
ily+0.019
itted+0.019
Neg logit contributions
eling-0.015
ited-0.013
!-0.012
 unw-0.010
ge-0.010
Token in
Token ID287
Feature activation-2.09e-03

Pos logit contributions
eling+0.033
ited+0.030
!+0.026
 unw+0.023
 transported+0.022
Neg logit contributions
 Their-0.043
 Some-0.041
arl-0.039
ily-0.039
itted-0.038
Token the
Token ID262
Feature activation+1.93e-02

Pos logit contributions
 Their+0.003
 Some+0.003
 many+0.003
ily+0.003
arl+0.003
Neg logit contributions
eling-0.004
ited-0.004
ivering-0.003
!-0.003
 unw-0.003
Token wind
Token ID2344
Feature activation-1.79e-02

Pos logit contributions
eling+0.029
ited+0.026
!+0.025
 unw+0.020
 bit+0.020
Neg logit contributions
 Their-0.036
 Some-0.034
ily-0.033
itted-0.033
arl-0.033
Token.
Token ID13
Feature activation+6.03e-03

Pos logit contributions
 Their+0.034
 Some+0.032
arl+0.032
itted+0.032
ily+0.032
Neg logit contributions
eling-0.024
ited-0.022
 unw-0.017
!-0.017
 transported-0.016
Token Lucy
Token ID22162
Feature activation+1.19e-01

Pos logit contributions
!+0.011
eling+0.011
ited+0.010
 unw+0.010
!"+0.009
Neg logit contributions
 lesson-0.007
arl-0.007
oys-0.007
utes-0.007
year-0.007
Token started
Token ID2067
Feature activation-2.40e-02

Pos logit contributions
eling+0.192
ited+0.176
!+0.148
 transported+0.147
ries+0.143
Neg logit contributions
 Their-0.189
 Some-0.176
 Everyone-0.167
ily-0.167
 Now-0.166
Token jumping
Token ID14284
Feature activation-1.53e-02

Pos logit contributions
 Their+0.031
 Some+0.030
arl+0.027
ily+0.027
itted+0.026
Neg logit contributions
eling-0.049
ited-0.046
!-0.042
 unw-0.038
ivering-0.037
Token up
Token ID510
Feature activation-6.00e-03

Pos logit contributions
 Their+0.029
 Some+0.029
ily+0.027
arl+0.027
itted+0.026
Neg logit contributions
eling-0.022
ited-0.019
!-0.017
 unw-0.015
 transported-0.014
Token and
Token ID290
Feature activation-2.59e-02

Pos logit contributions
 Their+0.011
 Some+0.010
ily+0.009
 lesson+0.009
arl+0.009
Neg logit contributions
eling-0.010
ited-0.009
!-0.007
ivering-0.007
 transported-0.007
Token down
Token ID866
Feature activation-2.47e-02

Pos logit contributions
 Their+0.046
 Some+0.043
 Everyone+0.041
ily+0.040
itted+0.040
Neg logit contributions
eling-0.039
ited-0.036
!-0.030
ivering-0.028
 unw-0.028
Token "
Token ID366
Feature activation+1.78e-02

Pos logit contributions
eling+0.015
!+0.015
ited+0.014
 unw+0.013
!"+0.012
Neg logit contributions
 Some-0.007
 Their-0.007
year-0.007
arl-0.007
ily-0.007
TokenWe
Token ID1135
Feature activation+9.43e-03

Pos logit contributions
ited+0.032
eling+0.030
!+0.027
 unw+0.024
 bit+0.024
Neg logit contributions
 Their-0.026
arl-0.025
avy-0.024
 Some-0.024
 many-0.023
Token love
Token ID1842
Feature activation+4.55e-02

Pos logit contributions
eling+0.014
ited+0.013
!+0.012
 unw+0.010
 transported+0.010
Neg logit contributions
 Their-0.017
 Some-0.017
ily-0.016
arl-0.015
itted-0.015
Token you
Token ID345
Feature activation-3.20e-02

Pos logit contributions
eling+0.066
ited+0.060
!+0.054
 unw+0.046
 transported+0.044
Neg logit contributions
 Their-0.087
 Some-0.084
arl-0.079
ily-0.079
itted-0.078
Token,
Token ID11
Feature activation+1.01e-02

Pos logit contributions
 Their+0.058
 Some+0.057
 many+0.054
 lesson+0.053
 Everyone+0.053
Neg logit contributions
eling-0.051
ited-0.046
 transported-0.036
ge-0.036
 unw-0.035
Token Lucy
Token ID22162
Feature activation+1.18e-01

Pos logit contributions
eling+0.020
ited+0.018
!+0.018
 unw+0.016
 transported+0.015
Neg logit contributions
 Their-0.014
 Some-0.013
itted-0.012
arl-0.012
year-0.012
Token.
Token ID13
Feature activation+1.42e-02

Pos logit contributions
eling+0.182
ited+0.167
 transported+0.130
 unw+0.128
ge+0.118
Neg logit contributions
 Their-0.220
ily-0.209
 Some-0.208
oys-0.202
arl-0.198
Token You
Token ID921
Feature activation-1.80e-02

Pos logit contributions
eling+0.021
!+0.021
ited+0.019
!"+0.018
 unw+0.017
Neg logit contributions
 Their-0.021
itted-0.021
 lesson-0.020
utes-0.020
ily-0.020
Token are
Token ID389
Feature activation+5.41e-02

Pos logit contributions
 Their+0.027
 Some+0.027
ily+0.026
 Everyone+0.025
 lesson+0.024
Neg logit contributions
eling-0.033
ited-0.030
ge-0.027
ivering-0.026
ries-0.026
Token beautiful
Token ID4950
Feature activation+1.19e-02

Pos logit contributions
ited+0.116
eling+0.114
ge+0.098
!+0.093
ivering+0.093
Neg logit contributions
 Their-0.069
ily-0.062
 Some-0.061
 many-0.060
 Everyone-0.057
Token and
Token ID290
Feature activation+2.21e-03

Pos logit contributions
eling+0.018
ited+0.017
ge+0.013
 unw+0.012
 transported+0.012
Neg logit contributions
 Some-0.022
 Their-0.022
ily-0.022
 Everyone-0.021
arl-0.021
Token and
Token ID290
Feature activation+1.60e-02

Pos logit contributions
 Their+0.031
 Some+0.030
year+0.029
arl+0.028
itted+0.028
Neg logit contributions
eling-0.018
!-0.016
ited-0.016
!"-0.012
 unw-0.012
Token had
Token ID550
Feature activation-1.23e-02

Pos logit contributions
eling+0.023
ited+0.020
!+0.019
 unw+0.016
 bit+0.015
Neg logit contributions
 Their-0.030
 Some-0.029
arl-0.028
itted-0.028
oys-0.028
Token no
Token ID645
Feature activation+1.91e-02

Pos logit contributions
 Their+0.018
arl+0.017
 Some+0.017
ily+0.017
year+0.016
Neg logit contributions
eling-0.022
ited-0.020
!-0.020
 unw-0.017
 transported-0.017
Token home
Token ID1363
Feature activation-2.25e-02

Pos logit contributions
eling+0.033
ited+0.030
!+0.028
 unw+0.025
 transported+0.024
Neg logit contributions
 Their-0.031
 Some-0.030
arl-0.028
ily-0.028
itted-0.027
Token.
Token ID13
Feature activation+1.26e-02

Pos logit contributions
 Their+0.040
ily+0.038
 Some+0.038
 many+0.037
oys+0.036
Neg logit contributions
eling-0.035
ited-0.033
 unw-0.024
ivering-0.023
 transported-0.022
Token Lucy
Token ID22162
Feature activation+1.17e-01

Pos logit contributions
!+0.021
eling+0.021
ited+0.019
!"+0.018
 unw+0.018
Neg logit contributions
 lesson-0.016
itted-0.016
avy-0.016
utes-0.015
arl-0.015
Token thought
Token ID1807
Feature activation-5.04e-02

Pos logit contributions
eling+0.217
ited+0.200
ge+0.169
ries+0.167
 avail+0.164
Neg logit contributions
 Their-0.162
ily-0.150
 Some-0.148
 Now-0.141
 Everyone-0.139
Token that
Token ID326
Feature activation-2.61e-02

Pos logit contributions
 Their+0.087
 Some+0.084
arl+0.082
itted+0.080
oys+0.078
Neg logit contributions
eling-0.081
ited-0.071
!-0.070
 unw-0.060
 transported-0.055
Token she
Token ID673
Feature activation+3.46e-02

Pos logit contributions
 Their+0.046
 Some+0.045
ily+0.042
 lesson+0.041
 Everyone+0.041
Neg logit contributions
eling-0.041
ited-0.038
!-0.033
 unw-0.030
ivering-0.029
Token should
Token ID815
Feature activation-2.24e-02

Pos logit contributions
eling+0.067
ited+0.061
!+0.053
ge+0.051
ivering+0.051
Neg logit contributions
 Their-0.048
 Some-0.046
ily-0.045
year-0.043
 lesson-0.042
Token take
Token ID1011
Feature activation-1.31e-02

Pos logit contributions
 Their+0.039
 Some+0.038
arl+0.035
ily+0.035
itted+0.035
Neg logit contributions
eling-0.036
ited-0.033
!-0.030
 unw-0.026
 transported-0.025
Token From
Token ID3574
Feature activation-2.33e-02

Pos logit contributions
!+0.032
eling+0.031
ited+0.029
!"+0.028
 unw+0.027
Neg logit contributions
 lesson-0.022
arl-0.022
itted-0.022
avy-0.021
oys-0.021
Token that
Token ID326
Feature activation-2.21e-02

Pos logit contributions
 Some+0.040
 Their+0.040
itted+0.036
arl+0.036
ily+0.035
Neg logit contributions
eling-0.040
ited-0.035
!-0.035
 unw-0.029
!"-0.028
Token day
Token ID1110
Feature activation+5.53e-03

Pos logit contributions
 Their+0.023
 Some+0.021
arl+0.020
itted+0.019
ily+0.019
Neg logit contributions
eling-0.052
ited-0.049
!-0.048
!"-0.042
 bit-0.041
Token on
Token ID319
Feature activation-1.29e-02

Pos logit contributions
eling+0.009
!+0.009
ited+0.008
!"+0.007
 transported+0.006
Neg logit contributions
 Their-0.010
 Some-0.009
itted-0.008
utes-0.008
arl-0.008
Token,
Token ID11
Feature activation+9.75e-03

Pos logit contributions
 Their+0.027
 Some+0.025
ily+0.025
 Everyone+0.024
 lesson+0.024
Neg logit contributions
eling-0.017
ited-0.016
!-0.011
 unw-0.011
ivering-0.011
Token Lucy
Token ID22162
Feature activation+1.17e-01

Pos logit contributions
eling+0.017
!+0.015
ited+0.015
 unw+0.013
ge+0.012
Neg logit contributions
 Their-0.015
arl-0.015
 Some-0.015
itted-0.014
year-0.014
Token kept
Token ID4030
Feature activation+9.98e-03

Pos logit contributions
eling+0.212
ited+0.197
!+0.172
 unw+0.161
ivering+0.156
Neg logit contributions
 Their-0.175
 Some-0.167
ily-0.159
itted-0.155
 Everyone-0.151
Token guessing
Token ID25260
Feature activation-3.47e-02

Pos logit contributions
eling+0.014
!+0.012
ited+0.012
 unw+0.010
 transported+0.010
Neg logit contributions
 Their-0.020
 Some-0.020
arl-0.019
oys-0.019
ily-0.018
Token how
Token ID703
Feature activation-1.96e-02

Pos logit contributions
 Their+0.058
 Some+0.056
arl+0.054
itted+0.052
ily+0.051
Neg logit contributions
eling-0.058
ited-0.052
!-0.051
 unw-0.043
 transported-0.041
Token far
Token ID1290
Feature activation+2.49e-02

Pos logit contributions
 Their+0.031
 Some+0.029
arl+0.028
tr+0.027
itted+0.027
Neg logit contributions
eling-0.035
ited-0.031
!-0.031
 unw-0.026
 transported-0.026
Token she
Token ID673
Feature activation+2.49e-02

Pos logit contributions
eling+0.040
ited+0.037
!+0.033
 unw+0.029
 transported+0.028
Neg logit contributions
 Their-0.043
 Some-0.041
arl-0.039
ily-0.039
itted-0.038
Token
Token ID198
Feature activation+5.68e-03

Pos logit contributions
eling+0.016
ited+0.015
!+0.013
 unw+0.011
ivering+0.011
Neg logit contributions
 Their-0.018
 Some-0.017
ily-0.016
 lesson-0.016
itted-0.016
TokenThe
Token ID464
Feature activation+1.83e-03

Pos logit contributions
eling+0.009
!+0.009
ited+0.009
 unw+0.007
!"+0.007
Neg logit contributions
 Some-0.008
itted-0.008
tr-0.008
arl-0.008
ometer-0.008
Token next
Token ID1306
Feature activation+8.67e-03

Pos logit contributions
eling+0.003
!+0.003
ited+0.003
 bit+0.002
 mess+0.002
Neg logit contributions
 Their-0.003
 Some-0.003
itted-0.003
year-0.003
arl-0.003
Token day
Token ID1110
Feature activation+4.80e-04

Pos logit contributions
!+0.017
ited+0.015
eling+0.015
 bit+0.014
!"+0.014
Neg logit contributions
 Their-0.012
 Some-0.011
arl-0.011
itted-0.011
year-0.011
Token,
Token ID11
Feature activation+1.03e-02

Pos logit contributions
eling+0.001
ited+0.001
ivering+0.000
!+0.000
 unw+0.000
Neg logit contributions
 Their-0.001
 Some-0.001
ily-0.001
 Everyone-0.001
oys-0.001
Token Lucy
Token ID22162
Feature activation+1.17e-01

Pos logit contributions
eling+0.019
ited+0.017
!+0.016
 unw+0.015
 transported+0.014
Neg logit contributions
 Their-0.015
 Some-0.015
arl-0.014
ily-0.014
itted-0.014
Token and
Token ID290
Feature activation-1.64e-03

Pos logit contributions
eling+0.210
ited+0.190
ivering+0.160
ries+0.155
ge+0.153
Neg logit contributions
 Their-0.178
ily-0.160
 Some-0.160
 Now-0.148
 Everyone-0.147
Token her
Token ID607
Feature activation+2.75e-02

Pos logit contributions
 Some+0.003
arl+0.003
 Their+0.003
itted+0.003
utes+0.003
Neg logit contributions
eling-0.003
!-0.002
ited-0.002
 unw-0.002
!"-0.002
Token friends
Token ID2460
Feature activation-1.88e-02

Pos logit contributions
eling+0.049
!+0.048
ited+0.045
 unw+0.042
 bit+0.040
Neg logit contributions
 Some-0.039
 Their-0.038
year-0.037
arl-0.036
itted-0.035
Token went
Token ID1816
Feature activation-2.80e-02

Pos logit contributions
 Their+0.032
 Some+0.030
ily+0.029
arl+0.028
 Everyone+0.028
Neg logit contributions
eling-0.031
ited-0.028
!-0.025
 unw-0.023
ivering-0.022
Token to
Token ID284
Feature activation-1.04e-03

Pos logit contributions
 Some+0.046
 Their+0.045
ily+0.044
year+0.043
arl+0.043
Neg logit contributions
!-0.042
eling-0.040
ited-0.036
 unw-0.034
 transported-0.033
Token From
Token ID3574
Feature activation-2.31e-02

Pos logit contributions
!+0.042
eling+0.039
ited+0.037
!"+0.036
 unw+0.035
Neg logit contributions
arl-0.028
 lesson-0.028
avy-0.027
itted-0.027
year-0.026
Token that
Token ID326
Feature activation-2.01e-02

Pos logit contributions
 Some+0.038
 Their+0.038
itted+0.035
arl+0.034
ily+0.034
Neg logit contributions
eling-0.041
ited-0.036
!-0.036
 unw-0.030
!"-0.029
Token moment
Token ID2589
Feature activation+4.62e-02

Pos logit contributions
 Their+0.022
 Some+0.021
arl+0.020
itted+0.020
ily+0.019
Neg logit contributions
eling-0.045
!-0.043
ited-0.043
!"-0.037
 bit-0.036
Token on
Token ID319
Feature activation-1.45e-02

Pos logit contributions
eling+0.059
!+0.057
ited+0.051
!"+0.042
 unw+0.042
Neg logit contributions
 Their-0.093
 Some-0.091
arl-0.085
itted-0.084
utes-0.083
Token,
Token ID11
Feature activation+5.38e-03

Pos logit contributions
 Their+0.031
 Some+0.030
ily+0.028
 Everyone+0.028
 lesson+0.028
Neg logit contributions
eling-0.018
ited-0.017
!-0.013
 unw-0.012
ivering-0.011
Token Lucy
Token ID22162
Feature activation+1.16e-01

Pos logit contributions
eling+0.009
ited+0.008
!+0.008
 unw+0.007
 transported+0.006
Neg logit contributions
 Their-0.009
 Some-0.008
arl-0.008
itted-0.008
year-0.008
Token marked
Token ID7498
Feature activation-5.35e-02

Pos logit contributions
eling+0.186
ited+0.169
!+0.148
 unw+0.133
ivering+0.127
Neg logit contributions
 Their-0.200
 Some-0.192
ily-0.184
itted-0.180
year-0.177
Token all
Token ID477
Feature activation-2.84e-03

Pos logit contributions
 Their+0.111
arl+0.107
 Some+0.105
itted+0.101
oys+0.100
Neg logit contributions
eling-0.071
!-0.065
ited-0.055
 unw-0.049
ries-0.048
Token her
Token ID607
Feature activation+1.84e-02

Pos logit contributions
 Their+0.005
 Some+0.005
arl+0.004
ily+0.004
itted+0.004
Neg logit contributions
eling-0.005
ited-0.004
!-0.004
 unw-0.003
 transported-0.003
Token w
Token ID266
Feature activation+7.97e-03

Pos logit contributions
eling+0.033
ited+0.030
!+0.029
 unw+0.026
 transported+0.024
Neg logit contributions
 Their-0.029
 Some-0.028
arl-0.027
ily-0.026
year-0.025
Tokenaffles
Token ID48501
Feature activation-6.21e-02

Pos logit contributions
eling+0.017
ited+0.015
!+0.014
 unw+0.013
ivering+0.012
Neg logit contributions
 Their-0.011
 Some-0.010
tr-0.009
arl-0.009
 lesson-0.009
Token
Token ID198
Feature activation-1.14e-02

Pos logit contributions
 lesson+0.008
arl+0.008
year+0.008
avy+0.008
itted+0.008
Neg logit contributions
!-0.011
eling-0.011
ited-0.010
 unw-0.010
!"-0.009
Token
Token ID198
Feature activation-8.01e-03

Pos logit contributions
itted+0.018
tr+0.018
 Some+0.017
arl+0.017
avy+0.016
Neg logit contributions
eling-0.017
!-0.017
ited-0.015
 unw-0.014
!"-0.014
TokenAnd
Token ID1870
Feature activation+1.42e-02

Pos logit contributions
 Some+0.011
itted+0.011
 Their+0.011
arl+0.011
tr+0.011
Neg logit contributions
eling-0.014
ited-0.013
!-0.013
 unw-0.011
ivering-0.011
Token so
Token ID523
Feature activation-2.14e-02

Pos logit contributions
!+0.019
eling+0.019
ited+0.017
 unw+0.015
!"+0.014
Neg logit contributions
 Some-0.027
arl-0.027
 Their-0.026
itted-0.025
utes-0.024
Token,
Token ID11
Feature activation+8.90e-04

Pos logit contributions
 Some+0.038
 Their+0.037
arl+0.036
itted+0.035
ily+0.034
Neg logit contributions
eling-0.033
!-0.030
ited-0.030
!"-0.024
 unw-0.024
Token Lucy
Token ID22162
Feature activation+1.15e-01

Pos logit contributions
eling+0.002
ited+0.001
!+0.001
 unw+0.001
 transported+0.001
Neg logit contributions
 Their-0.001
 Some-0.001
arl-0.001
itted-0.001
ily-0.001
Token and
Token ID290
Feature activation-6.25e-03

Pos logit contributions
eling+0.183
ited+0.165
!+0.134
ries+0.126
 unw+0.125
Neg logit contributions
 Their-0.200
 Some-0.186
ily-0.184
 Everyone-0.179
itted-0.177
Token Peter
Token ID5613
Feature activation+2.56e-02

Pos logit contributions
 Some+0.010
arl+0.010
itted+0.010
 Their+0.010
year+0.010
Neg logit contributions
eling-0.010
!-0.009
ited-0.009
 unw-0.008
ries-0.007
Token left
Token ID1364
Feature activation-1.49e-02

Pos logit contributions
eling+0.042
ited+0.038
!+0.034
 unw+0.030
ivering+0.030
Neg logit contributions
 Their-0.043
 Some-0.041
ily-0.039
 Everyone-0.038
itted-0.038
Token the
Token ID262
Feature activation+1.18e-02

Pos logit contributions
 Their+0.024
 Some+0.023
ily+0.021
 Everyone+0.021
arl+0.021
Neg logit contributions
eling-0.027
ited-0.025
!-0.021
ivering-0.020
 unw-0.020
Token forest
Token ID8222
Feature activation+1.34e-02

Pos logit contributions
eling+0.019
ited+0.017
!+0.016
 unw+0.014
 transported+0.013
Neg logit contributions
 Their-0.020
 Some-0.020
arl-0.018
ily-0.018
itted-0.018
Token
Token ID198
Feature activation+5.68e-03

Pos logit contributions
eling+0.016
ited+0.014
!+0.012
 unw+0.011
ivering+0.010
Neg logit contributions
 Their-0.024
 Some-0.023
ily-0.023
arl-0.022
 lesson-0.022
TokenThe
Token ID464
Feature activation+1.83e-03

Pos logit contributions
eling+0.009
!+0.009
ited+0.009
 unw+0.007
!"+0.007
Neg logit contributions
 Some-0.008
itted-0.008
tr-0.008
arl-0.008
ometer-0.008
Token next
Token ID1306
Feature activation+8.67e-03

Pos logit contributions
eling+0.003
!+0.003
ited+0.003
 bit+0.002
 mess+0.002
Neg logit contributions
 Their-0.003
 Some-0.003
itted-0.003
year-0.003
arl-0.003
Token day
Token ID1110
Feature activation+4.80e-04

Pos logit contributions
!+0.017
ited+0.015
eling+0.015
 bit+0.014
!"+0.014
Neg logit contributions
 Their-0.012
 Some-0.011
arl-0.011
itted-0.011
year-0.011
Token,
Token ID11
Feature activation+1.03e-02

Pos logit contributions
eling+0.001
ited+0.001
ivering+0.000
!+0.000
 unw+0.000
Neg logit contributions
 Their-0.001
 Some-0.001
ily-0.001
 Everyone-0.001
oys-0.001
Token Samantha
Token ID34778
Feature activation+1.15e-01

Pos logit contributions
eling+0.019
ited+0.017
!+0.016
 unw+0.015
 transported+0.014
Neg logit contributions
 Their-0.015
 Some-0.015
arl-0.014
ily-0.014
itted-0.014
Token was
Token ID373
Feature activation+1.02e-02

Pos logit contributions
eling+0.196
ited+0.181
ivering+0.148
!+0.142
ge+0.142
Neg logit contributions
 Their-0.182
ily-0.168
 Some-0.167
 Now-0.153
 Everyone-0.153
Token feeling
Token ID4203
Feature activation+1.52e-02

Pos logit contributions
eling+0.017
ited+0.016
!+0.014
 unw+0.012
ge+0.012
Neg logit contributions
 Their-0.017
 Some-0.016
ily-0.016
year-0.015
itted-0.015
Token much
Token ID881
Feature activation+4.75e-03

Pos logit contributions
eling+0.027
ited+0.025
!+0.023
 unw+0.020
ge+0.020
Neg logit contributions
 Their-0.024
 Some-0.023
ily-0.022
year-0.021
itted-0.021
Token better
Token ID1365
Feature activation-2.29e-02

Pos logit contributions
eling+0.009
ited+0.007
ge+0.007
!+0.007
ries+0.007
Neg logit contributions
 Some-0.007
 Their-0.006
 lesson-0.006
 many-0.006
year-0.006
Token.
Token ID13
Feature activation+3.81e-04

Pos logit contributions
 Their+0.038
ily+0.036
 many+0.036
itted+0.036
 Some+0.036
Neg logit contributions
eling-0.036
ited-0.033
 unw-0.026
ries-0.025
ivering-0.024
Token."
Token ID526
Feature activation+3.09e-02

Pos logit contributions
 Their+0.062
 Some+0.060
arl+0.058
year+0.056
itted+0.056
Neg logit contributions
eling-0.069
ited-0.062
!-0.060
 unw-0.053
!"-0.049
Token
Token ID198
Feature activation+6.59e-03

Pos logit contributions
eling+0.055
!+0.054
ited+0.051
 unw+0.047
!"+0.045
Neg logit contributions
year-0.040
arl-0.040
 lesson-0.039
itted-0.038
avy-0.038
Token
Token ID198
Feature activation+1.51e-03

Pos logit contributions
eling+0.010
!+0.010
ited+0.009
 unw+0.008
!"+0.008
Neg logit contributions
itted-0.010
tr-0.010
 Some-0.010
 lesson-0.009
avy-0.009
TokenSo
Token ID2396
Feature activation-2.87e-02

Pos logit contributions
eling+0.003
ited+0.003
!+0.003
 unw+0.002
ivering+0.002
Neg logit contributions
 Some-0.002
 Their-0.002
itted-0.002
arl-0.002
 lesson-0.002
Token,
Token ID11
Feature activation+6.26e-03

Pos logit contributions
 Their+0.057
 Some+0.056
arl+0.056
year+0.054
itted+0.054
Neg logit contributions
eling-0.036
!-0.034
ited-0.032
 unw-0.026
!"-0.025
Token Lucy
Token ID22162
Feature activation+1.15e-01

Pos logit contributions
!+0.010
eling+0.010
ited+0.009
 unw+0.009
!"+0.008
Neg logit contributions
arl-0.010
itted-0.010
 Their-0.009
 Some-0.009
year-0.009
Token and
Token ID290
Feature activation-4.07e-03

Pos logit contributions
eling+0.193
ited+0.174
ivering+0.141
!+0.140
ge+0.139
Neg logit contributions
 Their-0.189
ily-0.175
 Some-0.172
 Now-0.164
itted-0.163
Token her
Token ID607
Feature activation+2.29e-02

Pos logit contributions
arl+0.007
 Some+0.007
itted+0.007
utes+0.007
 Their+0.007
Neg logit contributions
eling-0.006
!-0.006
ited-0.005
 unw-0.005
!"-0.004
Token mom
Token ID1995
Feature activation+8.63e-03

Pos logit contributions
eling+0.041
!+0.040
ited+0.038
 unw+0.034
 bit+0.033
Neg logit contributions
 Some-0.032
 Their-0.032
year-0.030
arl-0.030
itted-0.029
Token bought
Token ID5839
Feature activation+2.81e-02

Pos logit contributions
eling+0.014
ited+0.013
!+0.012
 unw+0.010
 transported+0.010
Neg logit contributions
 Their-0.015
 Some-0.014
arl-0.014
ily-0.014
itted-0.013
Token the
Token ID262
Feature activation+1.39e-02

Pos logit contributions
eling+0.048
!+0.045
ited+0.043
 unw+0.038
ries+0.036
Neg logit contributions
 Their-0.043
arl-0.043
 Some-0.041
itted-0.040
utes-0.040
Token a
Token ID257
Feature activation+3.28e-02

Pos logit contributions
eling+0.020
ited+0.018
!+0.015
ivering+0.014
ries+0.013
Neg logit contributions
 Their-0.023
 Some-0.021
ily-0.021
year-0.020
 many-0.020
Token question
Token ID1808
Feature activation+1.75e-02

Pos logit contributions
eling+0.059
ited+0.054
!+0.050
 unw+0.045
 transported+0.043
Neg logit contributions
 Their-0.051
 Some-0.049
arl-0.045
ily-0.045
itted-0.044
Token.
Token ID13
Feature activation+3.93e-03

Pos logit contributions
eling+0.021
ited+0.019
!+0.017
 unw+0.014
 transported+0.013
Neg logit contributions
 Their-0.037
 Some-0.036
arl-0.034
ily-0.034
itted-0.034
Tokenâ
Token ID22940
Feature activation+4.09e-02

Pos logit contributions
!+0.007
eling+0.007
ited+0.007
 unw+0.006
!"+0.006
Neg logit contributions
arl-0.005
 lesson-0.005
itted-0.005
utes-0.004
 Their-0.004
Token
Token ID26391
Feature activation-3.05e-02

Pos logit contributions
eling+0.077
ited+0.071
!+0.065
 unw+0.057
ivering+0.056
Neg logit contributions
 Their-0.061
 Some-0.060
arl-0.054
ily-0.054
itted-0.054
Token Lucy
Token ID22162
Feature activation+1.15e-01

Pos logit contributions
 Their+0.027
 Everyone+0.025
 Some+0.023
 Now+0.021
 They+0.020
Neg logit contributions
eling-0.080
ited-0.076
ivering-0.062
ries-0.061
 unw-0.061
Token looked
Token ID3114
Feature activation-3.11e-03

Pos logit contributions
eling+0.187
ited+0.171
ivering+0.141
!+0.140
 transported+0.138
Neg logit contributions
 Their-0.190
 Some-0.176
ily-0.169
 Everyone-0.168
 Now-0.164
Token confused
Token ID10416
Feature activation-8.07e-03

Pos logit contributions
 Their+0.005
 Some+0.005
 many+0.004
ily+0.004
year+0.004
Neg logit contributions
eling-0.006
ited-0.005
ivering-0.004
ge-0.004
ries-0.004
Token and
Token ID290
Feature activation-1.37e-02

Pos logit contributions
 Their+0.018
 Some+0.017
arl+0.017
ily+0.016
itted+0.016
Neg logit contributions
eling-0.009
ited-0.008
!-0.007
 unw-0.006
 transported-0.005
Token asked
Token ID1965
Feature activation+7.37e-03

Pos logit contributions
 Their+0.026
 Some+0.025
ily+0.023
arl+0.023
year+0.023
Neg logit contributions
eling-0.020
ited-0.018
!-0.016
 unw-0.014
ivering-0.014
Token
Token ID6184
Feature activation-1.20e-02

Pos logit contributions
eling+0.009
!+0.008
ited+0.007
 unw+0.006
!"+0.005
Neg logit contributions
 Their-0.015
arl-0.015
itted-0.015
 Some-0.015
tr-0.015
Token like
Token ID588
Feature activation+4.97e-02

Pos logit contributions
eling+0.050
ited+0.046
!+0.043
 unw+0.036
ivering+0.035
Neg logit contributions
 Their-0.054
 Some-0.052
arl-0.048
ily-0.048
itted-0.048
Token this
Token ID428
Feature activation-3.28e-02

Pos logit contributions
eling+0.082
ited+0.074
!+0.071
 unw+0.060
 transported+0.057
Neg logit contributions
 Their-0.084
 Some-0.080
arl-0.076
ily-0.076
itted-0.075
Token,
Token ID11
Feature activation+1.46e-03

Pos logit contributions
 Their+0.058
 Some+0.056
 lesson+0.053
arl+0.053
ily+0.053
Neg logit contributions
eling-0.051
ited-0.047
!-0.040
 unw-0.037
 transported-0.036
Tokenâ
Token ID22940
Feature activation+4.34e-02

Pos logit contributions
eling+0.002
ited+0.002
!+0.002
 unw+0.001
 transported+0.001
Neg logit contributions
 Their-0.003
 Some-0.003
arl-0.003
ily-0.003
itted-0.002
Token
Token ID26391
Feature activation-3.06e-02

Pos logit contributions
eling+0.086
ited+0.080
!+0.074
 unw+0.067
 transported+0.065
Neg logit contributions
 Their-0.059
 Some-0.056
arl-0.052
ily-0.052
itted-0.050
Token Lucy
Token ID22162
Feature activation+1.14e-01

Pos logit contributions
 Their+0.046
 Everyone+0.045
 Some+0.044
arl+0.043
 Now+0.042
Neg logit contributions
eling-0.057
ited-0.052
 unw-0.040
ge-0.040
ries-0.040
Token said
Token ID531
Feature activation-3.20e-03

Pos logit contributions
eling+0.146
ited+0.137
 unw+0.105
 transported+0.104
ge+0.099
Neg logit contributions
 Their-0.227
 Some-0.216
ily-0.211
 Everyone-0.209
 lesson-0.204
Token.
Token ID13
Feature activation-4.65e-03

Pos logit contributions
 Their+0.006
 Some+0.006
arl+0.006
year+0.006
itted+0.006
Neg logit contributions
eling-0.004
!-0.004
ited-0.003
 unw-0.003
!"-0.003
Token
Token ID198
Feature activation-1.21e-02

Pos logit contributions
 Their+0.006
 lesson+0.006
ily+0.006
year+0.005
itted+0.005
Neg logit contributions
eling-0.008
!-0.008
ited-0.008
!"-0.007
 unw-0.007
Token
Token ID198
Feature activation-1.04e-02

Pos logit contributions
itted+0.018
tr+0.018
 Their+0.018
year+0.017
 Some+0.017
Neg logit contributions
eling-0.018
!-0.018
ited-0.016
!"-0.015
 unw-0.014
Tokenâ
Token ID22940
Feature activation+4.15e-02

Pos logit contributions
 Their+0.015
 Some+0.015
itted+0.014
tr+0.014
 lesson+0.014
Neg logit contributions
eling-0.018
ited-0.017
!-0.017
ivering-0.014
 unw-0.014
TokenNo
Token ID2949
Feature activation+6.88e-04

Pos logit contributions
ited+0.029
eling+0.027
 unw+0.023
 bit+0.022
!+0.021
Neg logit contributions
 Their-0.031
avy-0.031
arl-0.030
 Some-0.029
utes-0.029
Token I
Token ID314
Feature activation+2.33e-02

Pos logit contributions
eling+0.001
ited+0.001
!+0.001
 transported+0.001
 unw+0.001
Neg logit contributions
 Their-0.001
 Some-0.001
 lesson-0.001
ily-0.001
arl-0.001
Token don
Token ID836
Feature activation+2.30e-02

Pos logit contributions
eling+0.041
ited+0.036
ries+0.033
ivering+0.031
 transported+0.030
Neg logit contributions
 Their-0.035
 Some-0.035
ily-0.035
 lesson-0.034
arl-0.033
Token't
Token ID470
Feature activation+3.39e-02

Pos logit contributions
ited+0.051
eling+0.051
 bas+0.039
 unw+0.039
ishment+0.039
Neg logit contributions
 Everyone-0.026
 Their-0.026
 Some-0.025
ily-0.024
arl-0.023
Token,"
Token ID553
Feature activation-2.69e-03

Pos logit contributions
eling+0.047
ited+0.041
 transported+0.033
 unw+0.031
ries+0.030
Neg logit contributions
ily-0.065
 Some-0.065
 Their-0.064
arl-0.062
 Everyone-0.062
Token Lucy
Token ID22162
Feature activation+1.14e-01

Pos logit contributions
arl+0.005
utes+0.005
itted+0.005
year+0.005
 Some+0.005
Neg logit contributions
eling-0.004
!-0.004
ited-0.003
 unw-0.003
 bit-0.003
Token answered
Token ID9373
Feature activation+2.28e-02

Pos logit contributions
eling+0.189
ited+0.172
!+0.153
 unw+0.134
 transported+0.133
Neg logit contributions
 Their-0.195
 Some-0.185
ily-0.176
itted-0.170
 Everyone-0.169
Token.
Token ID13
Feature activation+1.08e-02

Pos logit contributions
eling+0.026
ited+0.022
!+0.020
 unw+0.016
 transported+0.014
Neg logit contributions
 Their-0.050
 Some-0.049
arl-0.047
ily-0.047
itted-0.046
Token
Token ID220
Feature activation+1.69e-02

Pos logit contributions
!+0.019
eling+0.019
ited+0.017
 unw+0.017
 transported+0.016
Neg logit contributions
arl-0.013
year-0.013
 lesson-0.013
 Their-0.013
ily-0.013
Token
Token ID198
Feature activation+1.45e-03

Pos logit contributions
!+0.029
eling+0.028
ited+0.024
!"+0.023
 unw+0.023
Neg logit contributions
year-0.023
 Some-0.023
 Their-0.022
arl-0.022
oys-0.021
Token
Token ID198
Feature activation+9.30e-04

Pos logit contributions
!+0.002
eling+0.002
!"+0.002
 unw+0.002
ited+0.002
Neg logit contributions
itted-0.002
 Some-0.002
year-0.002
arl-0.002
tr-0.002
Token "
Token ID366
Feature activation+9.21e-03

Pos logit contributions
 Some+0.003
 Their+0.003
ily+0.003
utes+0.003
arl+0.003
Neg logit contributions
!-0.005
eling-0.005
ited-0.005
 unw-0.005
 bit-0.004
TokenIt
Token ID1026
Feature activation-7.73e-03

Pos logit contributions
ited+0.017
eling+0.016
!+0.014
 unw+0.013
 bit+0.013
Neg logit contributions
 Their-0.014
avy-0.013
arl-0.012
 Some-0.012
utes-0.012
Token's
Token ID338
Feature activation-1.99e-03

Pos logit contributions
 Their+0.016
 Some+0.015
ily+0.015
 lesson+0.015
 Everyone+0.014
Neg logit contributions
eling-0.010
ited-0.010
!-0.007
ge-0.007
ivering-0.007
Token okay
Token ID8788
Feature activation+2.79e-02

Pos logit contributions
 Their+0.003
 Some+0.003
ily+0.003
arl+0.003
 lesson+0.002
Neg logit contributions
eling-0.004
ited-0.004
!-0.003
 unw-0.003
 transported-0.003
Token,
Token ID11
Feature activation+6.20e-03

Pos logit contributions
eling+0.045
ited+0.043
 transported+0.031
ivering+0.030
 unw+0.030
Neg logit contributions
 Their-0.051
 Some-0.049
ily-0.048
oys-0.048
 lesson-0.046
Token Lucy
Token ID22162
Feature activation+1.14e-01

Pos logit contributions
eling+0.013
ited+0.012
!+0.011
 unw+0.010
 transported+0.010
Neg logit contributions
 Their-0.008
 Some-0.008
ily-0.007
arl-0.007
 lesson-0.007
Token.
Token ID13
Feature activation+1.15e-02

Pos logit contributions
eling+0.155
ited+0.142
 unw+0.104
 transported+0.100
!+0.098
Neg logit contributions
 Their-0.229
 Some-0.219
ily-0.212
arl-0.207
oys-0.206
Token We
Token ID775
Feature activation+2.07e-02

Pos logit contributions
eling+0.017
!+0.016
ited+0.015
!"+0.013
 unw+0.013
Neg logit contributions
 Their-0.018
 lesson-0.017
year-0.017
utes-0.017
arl-0.017
Token all
Token ID477
Feature activation+4.23e-03

Pos logit contributions
eling+0.037
ited+0.033
!+0.029
 unw+0.027
 transported+0.027
Neg logit contributions
 Their-0.033
 Some-0.033
ily-0.030
 Everyone-0.030
 lesson-0.029
Token make
Token ID787
Feature activation+6.55e-03

Pos logit contributions
eling+0.007
ited+0.006
!+0.005
 unw+0.005
 transported+0.005
Neg logit contributions
 Their-0.007
 Some-0.007
ily-0.006
 Everyone-0.006
 lesson-0.006
Token mistakes
Token ID10135
Feature activation-1.90e-02

Pos logit contributions
eling+0.011
ited+0.010
!+0.009
 unw+0.008
 transported+0.007
Neg logit contributions
 Their-0.011
 Some-0.011
arl-0.010
ily-0.010
itted-0.010
Token that
Token ID326
Feature activation-2.06e-02

Pos logit contributions
 Their+0.001
 Some+0.001
arl+0.001
ily+0.001
itted+0.001
Neg logit contributions
eling-0.001
!-0.001
ited-0.001
 unw-0.001
 bit-0.001
Token a
Token ID257
Feature activation+2.06e-02

Pos logit contributions
 Their+0.033
 Some+0.032
 lesson+0.032
 Everyone+0.030
ily+0.030
Neg logit contributions
eling-0.035
ited-0.033
!-0.028
 unw-0.027
ivering-0.026
Token clever
Token ID14169
Feature activation+1.43e-03

Pos logit contributions
eling+0.039
ited+0.036
!+0.033
 unw+0.030
 transported+0.029
Neg logit contributions
 Their-0.030
 Some-0.029
arl-0.027
ily-0.026
itted-0.026
Token girl
Token ID2576
Feature activation+1.09e-02

Pos logit contributions
eling+0.002
ited+0.002
!+0.002
 unw+0.002
 bit+0.002
Neg logit contributions
 Their-0.002
 Some-0.002
arl-0.002
ily-0.002
itted-0.002
Token like
Token ID588
Feature activation+5.57e-02

Pos logit contributions
eling+0.019
ited+0.017
 unw+0.014
ivering+0.014
!+0.014
Neg logit contributions
 Their-0.018
 Some-0.017
ily-0.016
 Everyone-0.016
 lesson-0.016
Token Lucy
Token ID22162
Feature activation+1.13e-01

Pos logit contributions
eling+0.090
!+0.081
ited+0.079
 unw+0.067
 transported+0.064
Neg logit contributions
 Their-0.093
 Some-0.091
arl-0.088
year-0.085
itted-0.085
Token would
Token ID561
Feature activation-3.73e-03

Pos logit contributions
eling+0.168
ited+0.149
 unw+0.116
 transported+0.112
ivering+0.110
Neg logit contributions
 Their-0.210
 Some-0.200
ily-0.199
 lesson-0.189
itted-0.187
Token never
Token ID1239
Feature activation+1.35e-02

Pos logit contributions
 Their+0.006
 Some+0.005
ily+0.005
 lesson+0.005
 Everyone+0.005
Neg logit contributions
eling-0.007
ited-0.006
!-0.005
ries-0.005
 unw-0.005
Token allow
Token ID1249
Feature activation+3.81e-02

Pos logit contributions
eling+0.019
ited+0.017
!+0.014
ries+0.013
 unw+0.013
Neg logit contributions
 Their-0.025
 Some-0.024
 lesson-0.023
ily-0.023
 Everyone-0.023
Token others
Token ID1854
Feature activation-7.05e-05

Pos logit contributions
eling+0.062
ited+0.059
!+0.049
 unw+0.046
ivering+0.045
Neg logit contributions
 Their-0.064
 Some-0.062
ily-0.059
 Everyone-0.058
 lesson-0.058
Token to
Token ID284
Feature activation+5.58e-03

Pos logit contributions
 Their+0.000
 lesson+0.000
ily+0.000
 Some+0.000
 Everyone+0.000
Neg logit contributions
eling-0.000
ited-0.000
ivering-0.000
 unw-0.000
 transported-0.000
Token
Token ID198
Feature activation+4.45e-05

Pos logit contributions
!+0.027
eling+0.026
!"+0.022
ited+0.022
 unw+0.022
Neg logit contributions
year-0.024
 lesson-0.024
 Their-0.024
 Some-0.024
arl-0.024
Token
Token ID198
Feature activation+2.42e-04

Pos logit contributions
eling+0.000
!+0.000
ited+0.000
 unw+0.000
!"+0.000
Neg logit contributions
tr-0.000
arl-0.000
itted-0.000
oys-0.000
ometer-0.000
TokenWith
Token ID3152
Feature activation+4.17e-02

Pos logit contributions
eling+0.000
ited+0.000
!+0.000
 unw+0.000
ivering+0.000
Neg logit contributions
arl-0.000
 Their-0.000
 Some-0.000
itted-0.000
ometer-0.000
Token that
Token ID326
Feature activation-2.71e-02

Pos logit contributions
eling+0.059
ited+0.053
!+0.053
 unw+0.042
 bit+0.039
Neg logit contributions
arl-0.079
oys-0.078
tr-0.077
 Their-0.076
 Some-0.075
Token,
Token ID11
Feature activation+8.79e-04

Pos logit contributions
 Their+0.057
 Some+0.055
 lesson+0.052
ily+0.052
 Everyone+0.052
Neg logit contributions
eling-0.035
ited-0.031
!-0.025
 unw-0.022
ivering-0.022
Token Lucy
Token ID22162
Feature activation+1.13e-01

Pos logit contributions
eling+0.001
!+0.001
ited+0.001
 unw+0.001
 transported+0.001
Neg logit contributions
 Their-0.002
arl-0.002
 Some-0.001
itted-0.001
year-0.001
Token and
Token ID290
Feature activation-1.04e-03

Pos logit contributions
eling+0.185
ited+0.165
!+0.145
ivering+0.132
ge+0.130
Neg logit contributions
 Their-0.188
 Some-0.180
ily-0.172
itted-0.171
 Everyone-0.166
Token her
Token ID607
Feature activation+2.27e-02

Pos logit contributions
 Some+0.002
arl+0.002
 Their+0.002
itted+0.002
year+0.002
Neg logit contributions
eling-0.001
!-0.001
ited-0.001
 unw-0.001
 struggling-0.001
Token mom
Token ID1995
Feature activation+9.18e-03

Pos logit contributions
eling+0.045
ited+0.042
!+0.041
 unw+0.037
 transported+0.035
Neg logit contributions
 Their-0.030
 Some-0.029
arl-0.027
year-0.026
ily-0.026
Token left
Token ID1364
Feature activation-1.31e-02

Pos logit contributions
eling+0.014
ited+0.012
!+0.011
 unw+0.010
 transported+0.009
Neg logit contributions
 Their-0.017
 Some-0.017
arl-0.016
ily-0.016
itted-0.015
Token the
Token ID262
Feature activation+1.42e-02

Pos logit contributions
 Their+0.021
 Some+0.020
ily+0.019
 many+0.019
 lesson+0.018
Neg logit contributions
eling-0.023
ited-0.022
!-0.018
ivering-0.017
ge-0.017
Token't
Token ID470
Feature activation+3.44e-02

Pos logit contributions
ited+0.055
eling+0.054
 bas+0.043
 unw+0.043
ge+0.042
Neg logit contributions
 Everyone-0.026
 Their-0.026
 Some-0.026
ily-0.023
arl-0.023
Token move
Token ID1445
Feature activation+2.38e-02

Pos logit contributions
eling+0.063
ited+0.057
ries+0.048
 transported+0.048
 bas+0.044
Neg logit contributions
 Everyone-0.049
 Their-0.049
arl-0.048
 lesson-0.048
ot-0.047
Token,
Token ID11
Feature activation+7.61e-03

Pos logit contributions
eling+0.032
ited+0.029
!+0.022
 unw+0.022
 transported+0.021
Neg logit contributions
 Their-0.047
 Some-0.046
ily-0.044
arl-0.044
 lesson-0.043
Token Tom
Token ID4186
Feature activation+1.77e-02

Pos logit contributions
eling+0.014
!+0.013
ited+0.012
 unw+0.011
!"+0.011
Neg logit contributions
 Their-0.011
 Some-0.011
itted-0.011
year-0.010
arl-0.010
Token,"
Token ID553
Feature activation-4.58e-03

Pos logit contributions
eling+0.029
ited+0.027
 transported+0.022
 unw+0.022
 emerge+0.021
Neg logit contributions
 Their-0.029
ily-0.029
 Some-0.027
oys-0.027
 lesson-0.027
Token Lucy
Token ID22162
Feature activation+1.13e-01

Pos logit contributions
arl+0.008
itted+0.008
utes+0.007
avy+0.007
year+0.007
Neg logit contributions
eling-0.007
!-0.007
ited-0.006
 unw-0.006
 bit-0.006
Token mentioned
Token ID4750
Feature activation+1.23e-02

Pos logit contributions
eling+0.206
ited+0.194
!+0.165
ivering+0.164
ge+0.163
Neg logit contributions
 Their-0.173
 Some-0.157
ily-0.153
 Everyone-0.145
 His-0.141
Token.
Token ID13
Feature activation+9.41e-03

Pos logit contributions
eling+0.013
!+0.011
ited+0.011
 unw+0.008
ge+0.007
Neg logit contributions
 Their-0.028
 Some-0.027
arl-0.026
itted-0.026
year-0.026
Token "
Token ID366
Feature activation+1.02e-02

Pos logit contributions
eling+0.017
!+0.017
ited+0.016
 unw+0.015
!"+0.014
Neg logit contributions
year-0.011
arl-0.011
 lesson-0.011
avy-0.011
ily-0.011
TokenI
Token ID40
Feature activation+2.87e-02

Pos logit contributions
eling+0.015
ited+0.015
ge+0.012
!+0.012
 unw+0.012
Neg logit contributions
arl-0.017
 Their-0.017
avy-0.015
oys-0.015
 lesson-0.014
Token can
Token ID460
Feature activation-1.31e-02

Pos logit contributions
eling+0.051
ited+0.045
ries+0.039
!+0.039
ivering+0.037
Neg logit contributions
 Their-0.046
 Some-0.045
 lesson-0.042
ily-0.041
arl-0.040
Token looked
Token ID3114
Feature activation-1.37e-02

Pos logit contributions
 Their+0.002
 Some+0.002
ily+0.002
 Everyone+0.001
year+0.001
Neg logit contributions
eling-0.001
ited-0.001
!-0.001
ge-0.001
ivering-0.001
Token sad
Token ID6507
Feature activation-2.36e-02

Pos logit contributions
 Their+0.020
 Some+0.019
 many+0.018
ily+0.017
year+0.017
Neg logit contributions
eling-0.026
ited-0.024
ge-0.021
 unw-0.020
ries-0.020
Token and
Token ID290
Feature activation-1.17e-02

Pos logit contributions
 Their+0.054
 Some+0.053
arl+0.051
year+0.051
itted+0.050
Neg logit contributions
eling-0.023
!-0.021
ited-0.020
 unw-0.013
!"-0.013
Token lonely
Token ID21757
Feature activation-8.87e-03

Pos logit contributions
 Their+0.021
 Some+0.020
ily+0.019
arl+0.018
year+0.018
Neg logit contributions
eling-0.018
ited-0.017
!-0.015
 unw-0.013
ivering-0.013
Token.
Token ID13
Feature activation+6.25e-03

Pos logit contributions
 Their+0.020
 Some+0.019
arl+0.019
ily+0.018
itted+0.018
Neg logit contributions
eling-0.010
ited-0.008
!-0.007
 unw-0.006
 transported-0.005
Token Lucy
Token ID22162
Feature activation+1.13e-01

Pos logit contributions
eling+0.012
!+0.011
ited+0.010
 unw+0.010
!"+0.009
Neg logit contributions
arl-0.007
utes-0.007
 lesson-0.007
oys-0.007
year-0.007
Token wanted
Token ID2227
Feature activation-5.01e-04

Pos logit contributions
eling+0.213
ited+0.197
ries+0.171
ge+0.167
!+0.161
Neg logit contributions
 Their-0.159
ily-0.144
 Some-0.139
 His-0.129
 many-0.129
Token to
Token ID284
Feature activation+1.92e-04

Pos logit contributions
arl+0.001
 Their+0.001
itted+0.001
utes+0.001
year+0.001
Neg logit contributions
eling-0.001
!-0.001
ited-0.001
 unw-0.001
 bit-0.001
Token make
Token ID787
Feature activation+4.30e-04

Pos logit contributions
eling+0.000
ited+0.000
ivering+0.000
ries+0.000
!+0.000
Neg logit contributions
 Their-0.000
 Everyone-0.000
 Some-0.000
ily-0.000
 lesson-0.000
Token him
Token ID683
Feature activation-8.50e-03

Pos logit contributions
eling+0.001
ited+0.001
!+0.001
 unw+0.001
 transported+0.001
Neg logit contributions
 Their-0.001
 Some-0.001
arl-0.001
year-0.001
itted-0.001
Token feel
Token ID1254
Feature activation+4.04e-02

Pos logit contributions
 Their+0.013
 Some+0.013
 lesson+0.012
 many+0.012
ily+0.012
Neg logit contributions
eling-0.016
ited-0.015
!-0.012
 unw-0.011
ivering-0.011
Token "
Token ID366
Feature activation+1.84e-02

Pos logit contributions
!+0.017
eling+0.016
 unw+0.014
ited+0.014
!"+0.013
Neg logit contributions
ests-0.011
avy-0.011
 Some-0.011
itted-0.011
ily-0.011
TokenLet
Token ID5756
Feature activation+3.57e-03

Pos logit contributions
eling+0.033
ited+0.033
!+0.027
 unw+0.025
ivering+0.024
Neg logit contributions
 Their-0.028
 Some-0.026
avy-0.026
arl-0.025
itted-0.025
Token's
Token ID338
Feature activation+1.77e-03

Pos logit contributions
eling+0.005
ited+0.004
ge+0.003
ivering+0.003
 transported+0.003
Neg logit contributions
 Their-0.007
 Everyone-0.007
ily-0.007
 Some-0.007
 lesson-0.007
Token compromise
Token ID13110
Feature activation-1.04e-03

Pos logit contributions
eling+0.003
ited+0.003
!+0.002
ivering+0.002
ge+0.002
Neg logit contributions
 Their-0.003
 Some-0.003
 Everyone-0.003
ily-0.003
arl-0.003
Token.
Token ID13
Feature activation+1.47e-03

Pos logit contributions
 Their+0.002
 Some+0.002
ily+0.002
oys+0.002
 lesson+0.002
Neg logit contributions
eling-0.002
ited-0.001
!-0.001
 unw-0.001
 transported-0.001
Token Lucy
Token ID22162
Feature activation+1.13e-01

Pos logit contributions
!+0.002
eling+0.002
!"+0.002
 unw+0.001
ited+0.001
Neg logit contributions
avy-0.003
itted-0.003
 lesson-0.003
year-0.003
arl-0.003
Token can
Token ID460
Feature activation-1.62e-02

Pos logit contributions
eling+0.196
ited+0.184
ivering+0.148
ge+0.148
 transported+0.147
Neg logit contributions
 Their-0.176
 Some-0.171
 Now-0.162
 Everyone-0.161
ily-0.160
Token play
Token ID711
Feature activation-4.14e-04

Pos logit contributions
 Their+0.025
 Some+0.024
 Everyone+0.024
ily+0.023
 lesson+0.021
Neg logit contributions
eling-0.029
ited-0.028
!-0.022
 transported-0.022
ivering-0.022
Token on
Token ID319
Feature activation-1.80e-02

Pos logit contributions
 Their+0.001
 Some+0.001
ily+0.001
 Everyone+0.001
 lesson+0.001
Neg logit contributions
eling-0.001
ited-0.001
!-0.001
 unw-0.000
 transported-0.000
Token the
Token ID262
Feature activation+9.07e-03

Pos logit contributions
 Their+0.028
 Some+0.028
 many+0.026
ily+0.026
 Everyone+0.026
Neg logit contributions
eling-0.032
ited-0.030
!-0.024
ivering-0.024
 transported-0.024
Token rail
Token ID6787
Feature activation-7.33e-03

Pos logit contributions
eling+0.015
ited+0.013
!+0.012
 unw+0.011
 transported+0.010
Neg logit contributions
 Their-0.016
 Some-0.015
arl-0.014
ily-0.014
itted-0.014
Token Fin
Token ID4463
Feature activation+1.54e-02

Pos logit contributions
eling+0.027
ited+0.025
!+0.023
 unw+0.019
 transported+0.019
Neg logit contributions
 Their-0.032
 Some-0.031
arl-0.030
ily-0.029
itted-0.029
Token was
Token ID373
Feature activation+9.88e-03

Pos logit contributions
eling+0.024
ited+0.021
ries+0.017
ivering+0.017
 transported+0.017
Neg logit contributions
 Their-0.028
ily-0.026
 Some-0.025
 lesson-0.024
 Everyone-0.024
Token very
Token ID845
Feature activation+3.38e-03

Pos logit contributions
eling+0.017
ited+0.016
!+0.014
 unw+0.013
ivering+0.012
Neg logit contributions
 Their-0.016
 Some-0.016
ily-0.015
arl-0.014
year-0.014
Token happy
Token ID3772
Feature activation-1.08e-03

Pos logit contributions
eling+0.006
ited+0.005
!+0.005
 unw+0.004
 transported+0.004
Neg logit contributions
 Their-0.005
 Some-0.005
arl-0.005
ily-0.005
itted-0.005
Token.
Token ID13
Feature activation+9.80e-03

Pos logit contributions
 Their+0.002
 Some+0.002
ily+0.002
 lesson+0.002
 Everyone+0.002
Neg logit contributions
eling-0.002
ited-0.002
 unw-0.001
ivering-0.001
!-0.001
Token Lucy
Token ID22162
Feature activation+1.13e-01

Pos logit contributions
!+0.019
eling+0.017
!"+0.016
ited+0.016
!".+0.015
Neg logit contributions
year-0.012
itted-0.011
arl-0.011
 lesson-0.011
ily-0.011
Token felt
Token ID2936
Feature activation-2.55e-02

Pos logit contributions
eling+0.210
ited+0.197
ries+0.163
ivering+0.162
ge+0.160
Neg logit contributions
 Their-0.168
 Some-0.146
ily-0.144
 Everyone-0.143
 Now-0.141
Token like
Token ID588
Feature activation+5.11e-02

Pos logit contributions
 Their+0.037
 Some+0.035
ily+0.034
 Everyone+0.033
year+0.032
Neg logit contributions
eling-0.049
ited-0.046
!-0.041
ge-0.039
ries-0.039
Token she
Token ID673
Feature activation+3.14e-02

Pos logit contributions
eling+0.068
!+0.066
ited+0.055
 unw+0.050
 transported+0.050
Neg logit contributions
arl-0.097
 Their-0.095
tr-0.095
 Some-0.093
itted-0.089
Token had
Token ID550
Feature activation-1.61e-02

Pos logit contributions
eling+0.059
ited+0.055
ries+0.050
ge+0.047
ivering+0.046
Neg logit contributions
 Their-0.041
ily-0.041
 Some-0.038
year-0.038
 Everyone-0.038
Token rescued
Token ID19868
Feature activation+7.30e-02

Pos logit contributions
 Their+0.027
 Some+0.026
ily+0.025
itted+0.024
 Everyone+0.023
Neg logit contributions
eling-0.027
ited-0.025
!-0.022
ivering-0.020
 unw-0.020
Token let
Token ID1309
Feature activation-6.43e-03

Pos logit contributions
eling+0.005
ited+0.005
!+0.004
ge+0.004
 unw+0.004
Neg logit contributions
 Their-0.006
 Some-0.006
ily-0.005
year-0.005
 Everyone-0.005
Token out
Token ID503
Feature activation+1.84e-02

Pos logit contributions
 Their+0.011
 Some+0.010
ily+0.010
 Everyone+0.010
arl+0.010
Neg logit contributions
eling-0.011
ited-0.010
!-0.009
 unw-0.008
ivering-0.008
Token a
Token ID257
Feature activation+5.06e-03

Pos logit contributions
eling+0.032
!+0.031
ited+0.029
 unw+0.026
!"+0.025
Neg logit contributions
arl-0.027
 Their-0.026
utes-0.026
 Some-0.025
itted-0.025
Token big
Token ID1263
Feature activation+1.79e-02

Pos logit contributions
eling+0.009
ited+0.008
!+0.008
 unw+0.007
 bit+0.006
Neg logit contributions
 Their-0.008
 Some-0.008
ily-0.008
arl-0.007
itted-0.007
Token sn
Token ID3013
Feature activation-1.35e-02

Pos logit contributions
eling+0.027
ited+0.023
ge+0.018
!+0.018
 transported+0.018
Neg logit contributions
 Their-0.033
 Some-0.032
arl-0.031
 lesson-0.030
ily-0.030
Tokenarl
Token ID7063
Feature activation-2.01e-01

Pos logit contributions
 Their+0.022
 Some+0.021
 lesson+0.019
itted+0.018
utes+0.018
Neg logit contributions
eling-0.023
ited-0.022
!-0.021
ivering-0.018
 unw-0.017
Token and
Token ID290
Feature activation-1.14e-02

Pos logit contributions
 Some+0.451
 Their+0.444
year+0.435
ily+0.416
 Many+0.403
Neg logit contributions
!-0.197
eling-0.175
ited-0.154
!"-0.127
 unw-0.126
Token the
Token ID262
Feature activation-4.08e-04

Pos logit contributions
 Their+0.021
 Some+0.020
ily+0.019
year+0.018
arl+0.018
Neg logit contributions
eling-0.017
ited-0.016
!-0.014
 unw-0.012
ivering-0.012
Token mouse
Token ID10211
Feature activation+1.54e-03

Pos logit contributions
 Their+0.001
 Some+0.001
arl+0.001
utes+0.001
year+0.001
Neg logit contributions
eling-0.001
ited-0.001
!-0.001
 bit-0.001
 unw-0.000
Token knew
Token ID2993
Feature activation-6.24e-02

Pos logit contributions
eling+0.003
ited+0.002
ivering+0.002
ries+0.002
ge+0.002
Neg logit contributions
 Their-0.003
 Some-0.002
ily-0.002
 lesson-0.002
 His-0.002
Token he
Token ID339
Feature activation+1.17e-02

Pos logit contributions
 Their+0.101
 Some+0.094
ily+0.094
 lesson+0.089
year+0.087
Neg logit contributions
eling-0.107
ited-0.105
!-0.083
ivering-0.081
 unw-0.078
Token"
Token ID1
Feature activation+1.47e-02

Pos logit contributions
 Their+0.019
 Some+0.018
itted+0.017
arl+0.017
ily+0.017
Neg logit contributions
eling-0.018
ited-0.017
!-0.016
 unw-0.014
ivering-0.013
TokenHello
Token ID15496
Feature activation-2.10e-03

Pos logit contributions
ited+0.021
eling+0.021
 unw+0.017
!+0.017
 bit+0.015
Neg logit contributions
 Their-0.026
ometer-0.026
avy-0.025
utes-0.024
inter-0.023
Token,
Token ID11
Feature activation-1.59e-03

Pos logit contributions
 Their+0.003
 Some+0.003
ily+0.003
 many+0.003
oys+0.003
Neg logit contributions
eling-0.004
ited-0.003
ivering-0.003
 transported-0.003
cking-0.003
Token my
Token ID616
Feature activation-2.03e-02

Pos logit contributions
 Their+0.002
 Some+0.002
arl+0.002
ily+0.002
itted+0.002
Neg logit contributions
eling-0.003
ited-0.003
!-0.003
 unw-0.002
 transported-0.002
Token d
Token ID288
Feature activation-5.22e-02

Pos logit contributions
 Their+0.027
 Some+0.026
arl+0.023
ily+0.023
 Everyone+0.023
Neg logit contributions
eling-0.042
ited-0.039
!-0.034
 unw-0.032
 transported-0.032
Tokenarl
Token ID7063
Feature activation-1.97e-01

Pos logit contributions
 Their+0.035
 lesson+0.035
 Some+0.030
 many+0.024
 His+0.019
Neg logit contributions
eling-0.130
!-0.128
ited-0.125
ivering-0.121
!"-0.114
Tokenings
Token ID654
Feature activation+7.39e-02

Pos logit contributions
 Their+0.416
 Some+0.408
ily+0.386
year+0.385
itted+0.371
Neg logit contributions
eling-0.238
ited-0.209
!-0.198
 unw-0.154
!"-0.144
Token.
Token ID13
Feature activation+1.17e-02

Pos logit contributions
eling+0.083
ited+0.072
!+0.060
 unw+0.050
 transported+0.047
Neg logit contributions
 Their-0.164
 Some-0.160
ily-0.153
arl-0.153
itted-0.149
Token I
Token ID314
Feature activation+1.20e-02

Pos logit contributions
!+0.016
eling+0.016
ited+0.014
!"+0.014
 unw+0.013
Neg logit contributions
 lesson-0.019
utes-0.018
itted-0.018
avy-0.018
 Their-0.017
Token'm
Token ID1101
Feature activation+5.16e-02

Pos logit contributions
eling+0.025
ited+0.022
ries+0.021
ivering+0.020
!+0.019
Neg logit contributions
 Some-0.014
 Their-0.014
ily-0.014
year-0.014
 lesson-0.013
Token so
Token ID523
Feature activation-2.09e-02

Pos logit contributions
eling+0.087
ited+0.085
!+0.068
ge+0.064
 unw+0.062
Neg logit contributions
 Their-0.089
 Some-0.084
ily-0.082
year-0.078
 Everyone-0.078
TokenWhat
Token ID2061
Feature activation+3.46e-03

Pos logit contributions
eling+0.017
ited+0.017
!+0.014
 unw+0.013
ivering+0.013
Neg logit contributions
 Their-0.014
tr-0.012
 Some-0.012
avy-0.012
utes-0.012
Token happened
Token ID3022
Feature activation+2.68e-02

Pos logit contributions
eling+0.006
ited+0.005
!+0.005
 unw+0.004
ivering+0.004
Neg logit contributions
 Their-0.006
 Some-0.006
ily-0.005
 lesson-0.005
arl-0.005
Token,
Token ID11
Feature activation+1.04e-02

Pos logit contributions
eling+0.036
ited+0.033
!+0.025
 unw+0.024
 transported+0.022
Neg logit contributions
 Their-0.054
 Some-0.052
ily-0.051
arl-0.050
oys-0.049
Token my
Token ID616
Feature activation-1.78e-02

Pos logit contributions
eling+0.022
ited+0.021
!+0.019
 unw+0.018
 transported+0.017
Neg logit contributions
 Their-0.013
 Some-0.012
arl-0.011
ily-0.011
itted-0.011
Token d
Token ID288
Feature activation-5.00e-02

Pos logit contributions
 Their+0.026
 Some+0.025
arl+0.023
ily+0.023
itted+0.022
Neg logit contributions
eling-0.034
ited-0.032
!-0.029
 unw-0.026
 transported-0.025
Tokenarl
Token ID7063
Feature activation-1.97e-01

Pos logit contributions
 Their+0.043
 lesson+0.040
 Some+0.032
 many+0.029
 songs+0.028
Neg logit contributions
eling-0.119
!-0.116
ivering-0.116
ited-0.113
!"-0.101
Tokenings
Token ID654
Feature activation+7.88e-02

Pos logit contributions
 Their+0.450
 Some+0.446
year+0.441
ily+0.422
 Many+0.402
Neg logit contributions
eling-0.187
!-0.172
ited-0.162
!"-0.118
 unw-0.111
Token?
Token ID30
Feature activation+3.38e-02

Pos logit contributions
eling+0.053
!+0.045
ited+0.040
!"+0.022
ivering+0.018
Neg logit contributions
 Their-0.205
 Some-0.203
itted-0.196
year-0.194
utes-0.191
Token Why
Token ID4162
Feature activation-3.73e-03

Pos logit contributions
eling+0.058
!+0.055
ited+0.051
 unw+0.048
 transported+0.047
Neg logit contributions
 lesson-0.047
utes-0.042
Years-0.041
year-0.041
itted-0.041
Token do
Token ID466
Feature activation-4.63e-02

Pos logit contributions
 Their+0.006
 Some+0.005
ily+0.005
 lesson+0.005
 Everyone+0.005
Neg logit contributions
eling-0.007
ited-0.007
!-0.006
ivering-0.005
 unw-0.005
Token you
Token ID345
Feature activation-3.76e-02

Pos logit contributions
 Their+0.100
 Some+0.099
year+0.098
arl+0.097
itted+0.097
Neg logit contributions
eling-0.054
!-0.043
ited-0.039
 unw-0.031
!"-0.030
Token perfect
Token ID2818
Feature activation+5.05e-02

Pos logit contributions
ited+0.112
eling+0.108
ge+0.091
ivering+0.086
!+0.083
Neg logit contributions
 Their-0.062
ily-0.058
 Some-0.055
 many-0.054
 Everyone-0.052
Token fish
Token ID5916
Feature activation+1.06e-02

Pos logit contributions
eling+0.071
ited+0.067
 unw+0.044
 transported+0.044
ivering+0.044
Neg logit contributions
 Their-0.095
ily-0.095
 Some-0.094
oys-0.093
arl-0.092
Token,
Token ID11
Feature activation+6.97e-03

Pos logit contributions
eling+0.016
ited+0.014
 unw+0.011
 transported+0.010
ivering+0.010
Neg logit contributions
 Their-0.020
ily-0.020
oys-0.019
 Some-0.019
 lesson-0.018
Token my
Token ID616
Feature activation-1.94e-02

Pos logit contributions
eling+0.011
!+0.011
ited+0.010
 mess+0.010
!"+0.010
Neg logit contributions
year-0.010
itted-0.010
 Their-0.010
 Some-0.010
avy-0.010
Token d
Token ID288
Feature activation-5.29e-02

Pos logit contributions
 Their+0.027
 Some+0.026
ily+0.024
arl+0.024
 lesson+0.024
Neg logit contributions
eling-0.038
ited-0.036
!-0.031
 unw-0.029
ivering-0.029
Tokenarl
Token ID7063
Feature activation-1.96e-01

Pos logit contributions
 Their+0.061
 Some+0.056
 lesson+0.056
tr+0.049
year+0.046
Neg logit contributions
eling-0.112
!-0.106
ited-0.106
ivering-0.094
!"-0.092
Tokenings
Token ID654
Feature activation+7.75e-02

Pos logit contributions
 Their+0.403
 Some+0.394
ily+0.373
year+0.369
arl+0.365
Neg logit contributions
eling-0.248
ited-0.220
!-0.202
 unw-0.162
 transported-0.154
Token,"
Token ID553
Feature activation-7.62e-03

Pos logit contributions
eling+0.083
ited+0.074
!+0.069
 unw+0.052
!"+0.048
Neg logit contributions
 Their-0.173
 Some-0.168
arl-0.159
itted-0.159
ily-0.158
Token mom
Token ID1995
Feature activation-1.88e-03

Pos logit contributions
year+0.012
arl+0.011
itted+0.011
avy+0.010
 Some+0.010
Neg logit contributions
eling-0.013
!-0.012
ited-0.011
 unw-0.011
 mess-0.010
Token says
Token ID1139
Feature activation+1.03e-02

Pos logit contributions
 Their+0.003
 Some+0.003
ily+0.003
arl+0.002
 Everyone+0.002
Neg logit contributions
eling-0.004
ited-0.004
!-0.003
ge-0.003
 transported-0.003
Token.
Token ID13
Feature activation+6.73e-03

Pos logit contributions
eling+0.014
ited+0.013
!+0.010
 transported+0.009
 unw+0.009
Neg logit contributions
 Their-0.021
 Some-0.021
ily-0.020
arl-0.020
itted-0.019
Token"
Token ID1
Feature activation+1.25e-02

Pos logit contributions
 Their+0.010
 Some+0.010
itted+0.009
arl+0.009
ily+0.009
Neg logit contributions
eling-0.014
ited-0.013
!-0.012
 unw-0.011
ivering-0.010
TokenHello
Token ID15496
Feature activation-7.37e-04

Pos logit contributions
eling+0.019
ited+0.019
!+0.016
 unw+0.016
 transported+0.015
Neg logit contributions
ometer-0.020
 Their-0.020
avy-0.019
utes-0.019
itted-0.018
Token,
Token ID11
Feature activation-2.57e-03

Pos logit contributions
 Their+0.001
 many+0.001
 Some+0.001
ily+0.001
 Everyone+0.001
Neg logit contributions
eling-0.002
ited-0.001
ivering-0.001
itching-0.001
 choke-0.001
Token my
Token ID616
Feature activation-2.13e-02

Pos logit contributions
 Their+0.004
 Some+0.004
arl+0.003
itted+0.003
ily+0.003
Neg logit contributions
eling-0.005
ited-0.004
!-0.004
 unw-0.004
 transported-0.004
Token d
Token ID288
Feature activation-5.36e-02

Pos logit contributions
 Their+0.031
 Some+0.029
arl+0.027
ily+0.027
itted+0.026
Neg logit contributions
eling-0.041
ited-0.038
!-0.035
 unw-0.031
 transported-0.030
Tokenarl
Token ID7063
Feature activation-1.96e-01

Pos logit contributions
 Their+0.035
 Some+0.029
 lesson+0.029
 many+0.020
 His+0.019
Neg logit contributions
eling-0.139
!-0.135
ited-0.133
ivering-0.123
!"-0.120
Tokenings
Token ID654
Feature activation+7.62e-02

Pos logit contributions
 Their+0.408
 Some+0.399
ily+0.377
year+0.375
arl+0.366
Neg logit contributions
eling-0.243
ited-0.214
!-0.199
 unw-0.157
 transported-0.149
Token.
Token ID13
Feature activation+1.61e-02

Pos logit contributions
eling+0.092
ited+0.080
 transported+0.057
!+0.055
 unw+0.055
Neg logit contributions
 Their-0.164
 Some-0.160
ily-0.158
arl-0.155
 Everyone-0.152
Token I
Token ID314
Feature activation+1.21e-02

Pos logit contributions
!+0.021
eling+0.020
!"+0.018
ited+0.018
 transported+0.017
Neg logit contributions
 lesson-0.027
utes-0.026
 Their-0.026
itted-0.026
arl-0.026
Token'm
Token ID1101
Feature activation+5.29e-02

Pos logit contributions
eling+0.025
ited+0.022
ries+0.021
ivering+0.019
!+0.019
Neg logit contributions
 Their-0.015
ily-0.014
 Some-0.014
year-0.013
 lesson-0.013
Token so
Token ID523
Feature activation-2.10e-02

Pos logit contributions
eling+0.086
ited+0.083
!+0.067
ge+0.062
 unw+0.061
Neg logit contributions
 Their-0.094
 Some-0.089
ily-0.087
arl-0.083
 Everyone-0.082
Token you
Token ID345
Feature activation-3.85e-02

Pos logit contributions
eling+0.004
ited+0.004
!+0.004
 unw+0.003
 transported+0.003
Neg logit contributions
 Their-0.006
 Some-0.006
ily-0.006
arl-0.006
itted-0.006
Token okay
Token ID8788
Feature activation+1.72e-02

Pos logit contributions
 Their+0.042
 Some+0.039
ily+0.038
arl+0.036
itted+0.035
Neg logit contributions
eling-0.086
ited-0.081
!-0.070
 unw-0.070
ge-0.070
Token,
Token ID11
Feature activation+5.81e-03

Pos logit contributions
eling+0.030
ited+0.029
 unw+0.020
 transported+0.020
hole+0.020
Neg logit contributions
 Their-0.031
arl-0.028
 Some-0.028
ily-0.028
oys-0.027
Token my
Token ID616
Feature activation-1.31e-02

Pos logit contributions
eling+0.012
ited+0.011
!+0.011
 unw+0.010
ivering+0.009
Neg logit contributions
 Their-0.007
 Some-0.007
itted-0.006
year-0.006
arl-0.006
Token d
Token ID288
Feature activation-4.78e-02

Pos logit contributions
 Their+0.019
 Some+0.019
ily+0.017
arl+0.017
itted+0.017
Neg logit contributions
eling-0.024
ited-0.023
!-0.021
 unw-0.019
ivering-0.018
Tokenarl
Token ID7063
Feature activation-1.95e-01

Pos logit contributions
 Their+0.035
 lesson+0.034
 Some+0.031
 many+0.026
year+0.023
Neg logit contributions
eling-0.118
!-0.114
ivering-0.114
ited-0.113
!"-0.100
Tokenings
Token ID654
Feature activation+7.80e-02

Pos logit contributions
 Their+0.414
 Some+0.407
year+0.388
ily+0.385
 Everyone+0.369
Neg logit contributions
eling-0.232
ited-0.205
!-0.194
 unw-0.148
!"-0.140
Token?"
Token ID1701
Feature activation+2.23e-02

Pos logit contributions
eling+0.067
ited+0.058
!+0.058
!"+0.036
 unw+0.034
Neg logit contributions
 Their-0.190
 Some-0.187
itted-0.180
year-0.177
utes-0.176
Token she
Token ID673
Feature activation+2.94e-02

Pos logit contributions
eling+0.036
ited+0.032
!+0.032
 unw+0.028
 transported+0.026
Neg logit contributions
arl-0.034
itted-0.034
 Their-0.034
year-0.033
 Some-0.033
Token asked
Token ID1965
Feature activation+2.98e-02

Pos logit contributions
eling+0.041
ited+0.037
!+0.033
 unw+0.028
 transported+0.027
Neg logit contributions
 Their-0.057
 Some-0.056
arl-0.053
ily-0.052
itted-0.052
Token.
Token ID13
Feature activation+5.02e-03

Pos logit contributions
eling+0.034
ited+0.030
!+0.025
 unw+0.021
 transported+0.020
Neg logit contributions
 Their-0.066
 Some-0.064
ily-0.061
arl-0.061
itted-0.060
Token you
Token ID345
Feature activation-4.03e-02

Pos logit contributions
ily+0.047
 Their+0.046
 Everyone+0.043
 many+0.040
 Some+0.040
Neg logit contributions
ited-0.083
eling-0.072
ge-0.069
ivering-0.066
hole-0.059
Token go
Token ID467
Feature activation+5.49e-03

Pos logit contributions
 Their+0.069
 Some+0.067
ily+0.063
arl+0.063
 Everyone+0.061
Neg logit contributions
eling-0.067
ited-0.061
!-0.055
 unw-0.048
 transported-0.047
Token,
Token ID11
Feature activation-8.65e-05

Pos logit contributions
eling+0.009
ited+0.008
 transported+0.006
 unw+0.006
ivering+0.005
Neg logit contributions
 Their-0.010
ily-0.010
 Some-0.010
arl-0.010
oys-0.009
Token my
Token ID616
Feature activation-1.36e-02

Pos logit contributions
 Their+0.000
 Some+0.000
ily+0.000
arl+0.000
oys+0.000
Neg logit contributions
eling-0.000
ited-0.000
!-0.000
 unw-0.000
 transported-0.000
Token d
Token ID288
Feature activation-5.24e-02

Pos logit contributions
 Their+0.019
 Some+0.018
year+0.017
ily+0.016
itted+0.016
Neg logit contributions
eling-0.025
!-0.024
ited-0.024
!"-0.020
 unw-0.020
Tokenarl
Token ID7063
Feature activation-1.95e-01

Pos logit contributions
 Their+0.040
 lesson+0.040
 Some+0.032
 many+0.028
 songs+0.026
Neg logit contributions
eling-0.125
!-0.124
ivering-0.121
ited-0.119
!"-0.109
Tokenings
Token ID654
Feature activation+7.80e-02

Pos logit contributions
 Their+0.425
 Some+0.418
year+0.406
ily+0.394
 Everyone+0.378
Neg logit contributions
eling-0.213
ited-0.188
!-0.187
 unw-0.135
!"-0.135
Token,"
Token ID553
Feature activation+1.66e-03

Pos logit contributions
eling+0.089
ited+0.078
!+0.062
 unw+0.054
 transported+0.052
Neg logit contributions
 Their-0.172
 Some-0.168
ily-0.162
arl-0.161
 Everyone-0.157
Token Mom
Token ID11254
Feature activation+1.63e-02

Pos logit contributions
eling+0.002
!+0.002
ited+0.002
 unw+0.002
 mess+0.002
Neg logit contributions
arl-0.003
itted-0.003
year-0.003
utes-0.003
tr-0.003
Token said
Token ID531
Feature activation+7.93e-03

Pos logit contributions
eling+0.026
ited+0.024
!+0.022
 unw+0.019
 transported+0.018
Neg logit contributions
 Their-0.028
 Some-0.027
ily-0.026
arl-0.025
itted-0.025
Token.
Token ID13
Feature activation+8.20e-03

Pos logit contributions
eling+0.011
ited+0.010
!+0.007
ivering+0.006
 transported+0.006
Neg logit contributions
 Their-0.017
 Some-0.016
ily-0.016
arl-0.015
itted-0.015
Token to
Token ID284
Feature activation-3.95e-03

Pos logit contributions
 Their+0.021
 Some+0.020
ily+0.018
arl+0.018
itted+0.018
Neg logit contributions
eling-0.038
ited-0.036
!-0.033
 unw-0.030
ivering-0.030
Token grow
Token ID1663
Feature activation+4.76e-02

Pos logit contributions
 Their+0.007
 Some+0.006
 lesson+0.006
 Everyone+0.006
arl+0.006
Neg logit contributions
eling-0.006
ited-0.006
!-0.005
ries-0.004
 transported-0.004
Tokenl
Token ID75
Feature activation-6.81e-02

Pos logit contributions
eling+0.075
ited+0.070
!+0.057
 unw+0.055
ge+0.052
Neg logit contributions
 Their-0.084
 Some-0.082
ily-0.077
 lesson-0.075
year-0.075
Token and
Token ID290
Feature activation-1.68e-02

Pos logit contributions
 Their+0.137
 Some+0.132
ily+0.128
arl+0.127
itted+0.126
Neg logit contributions
eling-0.092
ited-0.081
!-0.066
 unw-0.059
 transported-0.057
Token sn
Token ID3013
Feature activation-9.11e-03

Pos logit contributions
 Their+0.033
 Some+0.033
arl+0.031
ily+0.031
itted+0.031
Neg logit contributions
eling-0.023
ited-0.020
!-0.018
 unw-0.015
 transported-0.014
Tokenarl
Token ID7063
Feature activation-1.95e-01

Pos logit contributions
 Their+0.011
 Some+0.011
ily+0.010
 lesson+0.010
oys+0.010
Neg logit contributions
eling-0.019
ited-0.018
!-0.017
ivering-0.015
 unw-0.014
Token to
Token ID284
Feature activation-3.29e-03

Pos logit contributions
year+0.458
 Some+0.454
 Their+0.435
ily+0.429
 Many+0.423
Neg logit contributions
!-0.172
eling-0.155
!"-0.111
ited-0.107
ge-0.094
Token try
Token ID1949
Feature activation-1.67e-02

Pos logit contributions
itted+0.006
 Their+0.006
arl+0.006
oys+0.006
utes+0.006
Neg logit contributions
eling-0.005
!-0.004
ited-0.004
 unw-0.003
 bit-0.003
Token and
Token ID290
Feature activation-1.26e-02

Pos logit contributions
 Their+0.028
 Some+0.027
arl+0.026
ily+0.026
itted+0.025
Neg logit contributions
eling-0.027
ited-0.024
!-0.024
 unw-0.020
 transported-0.019
Token scare
Token ID19437
Feature activation+1.96e-02

Pos logit contributions
 Their+0.024
 Some+0.023
ily+0.021
arl+0.021
 Everyone+0.021
Neg logit contributions
eling-0.018
ited-0.017
!-0.015
ivering-0.012
 unw-0.012
Token her
Token ID607
Feature activation+1.34e-02

Pos logit contributions
eling+0.028
!+0.026
ited+0.024
 bit+0.020
 unw+0.019
Neg logit contributions
 Their-0.036
 Some-0.035
arl-0.035
utes-0.034
itted-0.034
Token "
Token ID366
Feature activation+2.65e-03

Pos logit contributions
eling+0.000
!+0.000
ited+0.000
 unw+0.000
 transported+0.000
Neg logit contributions
 Some-0.000
 Their-0.000
ily-0.000
itted-0.000
year-0.000
TokenHello
Token ID15496
Feature activation+3.46e-03

Pos logit contributions
ited+0.005
eling+0.005
!+0.004
 unw+0.004
ivering+0.004
Neg logit contributions
 Their-0.004
 Some-0.004
ometer-0.004
 lesson-0.004
utes-0.003
Token,
Token ID11
Feature activation+1.43e-03

Pos logit contributions
eling+0.007
ited+0.007
ivering+0.005
ge+0.005
cking+0.005
Neg logit contributions
ily-0.005
 Their-0.005
 Some-0.005
 many-0.005
 Everyone-0.005
Token my
Token ID616
Feature activation-1.85e-02

Pos logit contributions
eling+0.002
ited+0.002
!+0.002
 unw+0.002
 transported+0.002
Neg logit contributions
 Their-0.002
 Some-0.002
arl-0.002
ily-0.002
itted-0.002
Token d
Token ID288
Feature activation-5.14e-02

Pos logit contributions
 Their+0.027
 Some+0.026
ily+0.024
arl+0.024
itted+0.023
Neg logit contributions
eling-0.035
ited-0.032
!-0.030
 unw-0.027
 transported-0.026
Tokenarl
Token ID7063
Feature activation-1.95e-01

Pos logit contributions
 Their+0.037
 lesson+0.035
 Some+0.030
 many+0.024
 His+0.022
Neg logit contributions
eling-0.129
!-0.126
ited-0.121
ivering-0.117
!"-0.111
Tokenings
Token ID654
Feature activation+7.62e-02

Pos logit contributions
 Their+0.429
 Some+0.422
year+0.408
ily+0.403
 Everyone+0.379
Neg logit contributions
eling-0.212
!-0.182
ited-0.182
!"-0.130
 unw-0.126
Token.
Token ID13
Feature activation+2.18e-02

Pos logit contributions
eling+0.082
ited+0.071
!+0.055
 unw+0.048
 transported+0.045
Neg logit contributions
 Their-0.173
 Some-0.169
ily-0.163
arl-0.163
 Everyone-0.158
Token I
Token ID314
Feature activation+1.66e-02

Pos logit contributions
eling+0.031
!+0.030
ited+0.028
 unw+0.025
!"+0.025
Neg logit contributions
 lesson-0.035
 Their-0.034
utes-0.034
itted-0.033
arl-0.033
Token'm
Token ID1101
Feature activation+5.36e-02

Pos logit contributions
eling+0.033
ited+0.030
ries+0.027
!+0.026
ivering+0.026
Neg logit contributions
 Their-0.022
 Some-0.021
ily-0.021
year-0.019
 lesson-0.019
Token so
Token ID523
Feature activation-1.85e-02

Pos logit contributions
eling+0.088
ited+0.087
!+0.068
ge+0.065
 unw+0.063
Neg logit contributions
 Their-0.095
 Some-0.088
ily-0.088
arl-0.083
year-0.083
Token disappointment
Token ID18641
Feature activation+2.41e-02

Pos logit contributions
 Their+0.013
 Some+0.013
arl+0.012
 lesson+0.012
ily+0.012
Neg logit contributions
eling-0.012
ited-0.011
!-0.009
ivering-0.009
 transported-0.008
Token.
Token ID13
Feature activation-7.67e-04

Pos logit contributions
eling+0.032
ited+0.029
!+0.021
 unw+0.020
 transported+0.020
Neg logit contributions
 Their-0.048
 Some-0.046
ily-0.045
arl-0.044
year-0.044
Token
Token ID198
Feature activation-5.23e-03

Pos logit contributions
arl+0.001
utes+0.001
oys+0.001
 lesson+0.001
year+0.001
Neg logit contributions
!-0.001
eling-0.001
ited-0.001
 unw-0.001
 transported-0.001
Token
Token ID198
Feature activation-4.34e-03

Pos logit contributions
tr+0.008
itted+0.008
oys+0.008
 Some+0.008
arl+0.008
Neg logit contributions
eling-0.008
!-0.008
ited-0.007
 unw-0.007
!"-0.006
TokenM
Token ID44
Feature activation-3.11e-02

Pos logit contributions
 Some+0.007
itted+0.007
 Their+0.007
arl+0.007
tr+0.006
Neg logit contributions
eling-0.007
ited-0.006
!-0.006
 unw-0.005
ivering-0.005
Tokenarl
Token ID7063
Feature activation-1.94e-01

Pos logit contributions
 Their+0.043
 Some+0.041
tr+0.037
utes+0.035
itted+0.035
Neg logit contributions
eling-0.064
ited-0.061
!-0.056
 unw-0.049
ivering-0.048
Tokenon
Token ID261
Feature activation-4.31e-02

Pos logit contributions
 Their+0.272
 Some+0.270
year+0.240
ily+0.239
arl+0.236
Neg logit contributions
eling-0.369
ited-0.342
!-0.332
 unw-0.288
ge-0.281
Token's
Token ID338
Feature activation-6.69e-03

Pos logit contributions
 Their+0.084
 Some+0.082
arl+0.079
ily+0.076
itted+0.075
Neg logit contributions
eling-0.060
ited-0.054
!-0.050
 unw-0.042
 transported-0.039
Token mum
Token ID25682
Feature activation+2.61e-02

Pos logit contributions
 Their+0.011
 Some+0.010
arl+0.010
oys+0.009
year+0.009
Neg logit contributions
eling-0.012
ited-0.011
!-0.010
 unw-0.009
 transported-0.009
Token hugged
Token ID41130
Feature activation-2.67e-03

Pos logit contributions
eling+0.039
ited+0.035
!+0.029
ge+0.027
ivering+0.027
Neg logit contributions
 Their-0.045
 Some-0.043
ily-0.042
year-0.042
 Everyone-0.040
Token him
Token ID683
Feature activation-5.96e-04

Pos logit contributions
 Their+0.005
 Some+0.005
utes+0.005
arl+0.005
tr+0.005
Neg logit contributions
eling-0.004
!-0.004
ited-0.003
 unw-0.003
!"-0.002
TokenNot
Token ID3673
Feature activation+8.73e-02

Pos logit contributions
ited+0.035
eling+0.034
 bit+0.033
 unw+0.030
!+0.030
Neg logit contributions
avy-0.024
 Their-0.023
 Everyone-0.021
itted-0.021
 Some-0.021
Token now
Token ID783
Feature activation-2.08e-02

Pos logit contributions
eling+0.126
ited+0.117
!+0.099
 unw+0.089
 transported+0.087
Neg logit contributions
 Their-0.165
 Some-0.159
ily-0.152
arl-0.152
 lesson-0.148
Token,
Token ID11
Feature activation+1.23e-02

Pos logit contributions
 Their+0.032
 Everyone+0.030
arl+0.029
 many+0.029
ily+0.029
Neg logit contributions
ited-0.039
eling-0.039
 transported-0.028
hole-0.027
ivering-0.027
Token my
Token ID616
Feature activation-1.60e-02

Pos logit contributions
eling+0.024
ited+0.022
!+0.020
 unw+0.018
 transported+0.018
Neg logit contributions
 Their-0.018
 Some-0.017
ily-0.016
arl-0.016
 Everyone-0.015
Token d
Token ID288
Feature activation-5.03e-02

Pos logit contributions
 Their+0.023
 Some+0.022
arl+0.021
ily+0.021
 lesson+0.021
Neg logit contributions
eling-0.030
ited-0.028
!-0.025
 unw-0.023
 transported-0.023
Tokenarl
Token ID7063
Feature activation-1.93e-01

Pos logit contributions
 Their+0.035
 lesson+0.032
 Some+0.027
 many+0.022
 Everyone+0.021
Neg logit contributions
eling-0.132
ited-0.125
!-0.124
ivering-0.122
ries-0.110
Tokenings
Token ID654
Feature activation+7.76e-02

Pos logit contributions
 Their+0.431
 Some+0.425
year+0.415
ily+0.404
 Everyone+0.385
Neg logit contributions
eling-0.206
ited-0.174
!-0.173
!"-0.123
 unw-0.122
Token,
Token ID11
Feature activation+1.37e-02

Pos logit contributions
eling+0.089
ited+0.079
!+0.074
 unw+0.056
!"+0.053
Neg logit contributions
 Their-0.169
 Some-0.165
itted-0.156
year-0.155
ily-0.154
Token it
Token ID340
Feature activation-1.94e-02

Pos logit contributions
eling+0.023
ited+0.021
!+0.020
 unw+0.017
 transported+0.017
Neg logit contributions
 Their-0.022
 Some-0.022
arl-0.020
ily-0.020
itted-0.020
Token is
Token ID318
Feature activation+2.56e-02

Pos logit contributions
tr+0.028
 Their+0.025
 lesson+0.025
 Some+0.024
ily+0.024
Neg logit contributions
ited-0.034
eling-0.033
ge-0.032
 avail-0.030
ishment-0.029
Token too
Token ID1165
Feature activation+5.90e-03

Pos logit contributions
eling+0.048
ited+0.048
ge+0.038
!+0.037
 transported+0.037
Neg logit contributions
 Their-0.038
 lesson-0.034
 Some-0.034
ily-0.034
 many-0.034
TokenWhat
Token ID2061
Feature activation-1.20e-03

Pos logit contributions
eling+0.011
ited+0.010
!+0.009
 unw+0.008
ge+0.008
Neg logit contributions
 Their-0.009
 Some-0.008
itted-0.008
arl-0.008
utes-0.008
Token happened
Token ID3022
Feature activation+2.49e-02

Pos logit contributions
 Their+0.002
 Some+0.002
ily+0.002
 lesson+0.002
 many+0.002
Neg logit contributions
eling-0.002
ited-0.002
!-0.002
ivering-0.002
 unw-0.002
Token,
Token ID11
Feature activation+5.59e-03

Pos logit contributions
eling+0.035
ited+0.032
 unw+0.022
ivering+0.022
 transported+0.021
Neg logit contributions
 Their-0.048
 Some-0.047
ily-0.047
arl-0.046
oys-0.045
Token my
Token ID616
Feature activation-2.14e-02

Pos logit contributions
ited+0.011
eling+0.011
!+0.009
ge+0.009
 unw+0.009
Neg logit contributions
 Their-0.007
 Some-0.007
ily-0.007
arl-0.007
 Everyone-0.006
Token d
Token ID288
Feature activation-5.13e-02

Pos logit contributions
 Their+0.032
 Some+0.030
arl+0.028
ily+0.028
itted+0.028
Neg logit contributions
eling-0.040
ited-0.037
!-0.034
 unw-0.031
 transported-0.030
Tokenarl
Token ID7063
Feature activation-1.93e-01

Pos logit contributions
 lesson+0.037
 Their+0.036
 Some+0.030
 many+0.026
year+0.022
Neg logit contributions
eling-0.128
!-0.124
ivering-0.121
ited-0.118
!"-0.110
Tokenings
Token ID654
Feature activation+7.56e-02

Pos logit contributions
 Their+0.424
 Some+0.419
year+0.407
ily+0.398
 Everyone+0.376
Neg logit contributions
eling-0.208
!-0.182
ited-0.179
!"-0.130
 unw-0.127
Token?
Token ID30
Feature activation+3.23e-02

Pos logit contributions
eling+0.062
ited+0.051
!+0.047
 unw+0.028
!"+0.026
Neg logit contributions
 Their-0.189
 Some-0.186
itted-0.177
year-0.176
ily-0.175
Token Why
Token ID4162
Feature activation-7.91e-03

Pos logit contributions
eling+0.060
!+0.056
ited+0.052
!"+0.048
 unw+0.048
Neg logit contributions
 lesson-0.042
 Their-0.038
utes-0.038
itted-0.038
arl-0.037
Token are
Token ID389
Feature activation+4.81e-02

Pos logit contributions
 Their+0.013
ily+0.012
 Some+0.012
 lesson+0.011
 many+0.011
Neg logit contributions
ited-0.014
eling-0.014
ivering-0.011
!-0.011
 unw-0.011
Token you
Token ID345
Feature activation-4.16e-02

Pos logit contributions
eling+0.054
!+0.039
ited+0.036
ries+0.030
 unw+0.030
Neg logit contributions
 Some-0.105
 Their-0.105
itted-0.105
arl-0.104
year-0.103
Token'm
Token ID1101
Feature activation+5.08e-02

Pos logit contributions
eling+0.046
ited+0.041
!+0.037
ivering+0.035
ries+0.035
Neg logit contributions
 Their-0.036
 Some-0.035
ily-0.032
 lesson-0.031
year-0.031
Token sorry
Token ID7926
Feature activation-1.46e-02

Pos logit contributions
eling+0.104
ited+0.097
!+0.090
 unw+0.081
 transported+0.079
Neg logit contributions
 Their-0.067
 Some-0.064
ily-0.059
arl-0.058
itted-0.057
Token,
Token ID11
Feature activation+6.59e-03

Pos logit contributions
 Their+0.029
 Some+0.028
ily+0.027
 Everyone+0.027
arl+0.026
Neg logit contributions
eling-0.021
ited-0.021
 unw-0.014
 transported-0.014
!-0.013
Token my
Token ID616
Feature activation-1.44e-02

Pos logit contributions
eling+0.013
ited+0.013
!+0.011
 unw+0.010
 transported+0.010
Neg logit contributions
 Their-0.009
 Some-0.009
ily-0.008
arl-0.008
 Everyone-0.008
Token d
Token ID288
Feature activation-5.01e-02

Pos logit contributions
 Their+0.020
 Some+0.020
ily+0.018
arl+0.018
 Everyone+0.017
Neg logit contributions
eling-0.029
ited-0.026
!-0.023
 unw-0.022
 transported-0.022
Tokenarl
Token ID7063
Feature activation-1.93e-01

Pos logit contributions
 lesson+0.036
 Their+0.036
 Some+0.028
 many+0.024
 Everyone+0.023
Neg logit contributions
!-0.120
eling-0.120
ivering-0.115
ited-0.114
!"-0.106
Tokenings
Token ID654
Feature activation+7.70e-02

Pos logit contributions
 Their+0.427
 Some+0.420
year+0.406
ily+0.399
 Everyone+0.380
Neg logit contributions
eling-0.206
ited-0.179
!-0.177
 unw-0.126
!"-0.126
Token.
Token ID13
Feature activation+1.92e-02

Pos logit contributions
eling+0.084
ited+0.073
!+0.063
 unw+0.050
 transported+0.047
Neg logit contributions
 Their-0.173
 Some-0.168
arl-0.160
ily-0.160
itted-0.158
Token The
Token ID383
Feature activation+6.02e-03

Pos logit contributions
eling+0.023
!+0.023
!"+0.020
ited+0.018
 unw+0.018
Neg logit contributions
 lesson-0.033
utes-0.032
year-0.032
Years-0.031
itted-0.031
Token tornado
Token ID33718
Feature activation+6.22e-02

Pos logit contributions
eling+0.010
ited+0.009
!+0.009
 unw+0.008
 mess+0.007
Neg logit contributions
 Their-0.010
 Some-0.009
year-0.009
itted-0.009
arl-0.009
Token took
Token ID1718
Feature activation-1.14e-02

Pos logit contributions
eling+0.118
ited+0.115
ge+0.100
ivering+0.098
 avail+0.096
Neg logit contributions
 lesson-0.080
 Their-0.080
tr-0.075
ily-0.072
 Some-0.070
TokenWe
Token ID1135
Feature activation+1.26e-02

Pos logit contributions
eling+0.012
ited+0.011
!+0.010
 bit+0.009
 unw+0.009
Neg logit contributions
 Their-0.011
arl-0.010
 Some-0.010
tr-0.010
avy-0.010
Token know
Token ID760
Feature activation+1.89e-02

Pos logit contributions
eling+0.024
ited+0.022
!+0.020
 transported+0.019
ivering+0.019
Neg logit contributions
 Their-0.019
 Some-0.018
ily-0.017
 lesson-0.016
itted-0.016
Token,
Token ID11
Feature activation+1.03e-02

Pos logit contributions
ited+0.034
eling+0.032
ivering+0.026
 transported+0.025
ge+0.024
Neg logit contributions
 Their-0.032
 many-0.029
 Some-0.029
ily-0.029
 Everyone-0.028
Token my
Token ID616
Feature activation-1.55e-02

Pos logit contributions
eling+0.017
ited+0.017
!+0.014
 unw+0.013
ge+0.013
Neg logit contributions
 Their-0.017
 Some-0.017
ily-0.016
arl-0.015
 lesson-0.015
Token d
Token ID288
Feature activation-4.79e-02

Pos logit contributions
 Their+0.023
 Some+0.022
ily+0.020
arl+0.020
 Everyone+0.019
Neg logit contributions
eling-0.030
ited-0.028
!-0.025
 unw-0.023
 transported-0.022
Tokenarl
Token ID7063
Feature activation-1.93e-01

Pos logit contributions
 Their+0.036
 lesson+0.034
 Some+0.027
 many+0.024
 His+0.023
Neg logit contributions
eling-0.117
ivering-0.111
!-0.111
ited-0.110
ries-0.100
Tokenings
Token ID654
Feature activation+7.74e-02

Pos logit contributions
 Their+0.424
 Some+0.419
year+0.404
ily+0.400
 Everyone+0.379
Neg logit contributions
eling-0.208
ited-0.179
!-0.176
 unw-0.126
!"-0.124
Token.
Token ID13
Feature activation+2.31e-02

Pos logit contributions
eling+0.084
ited+0.073
!+0.066
 unw+0.050
ivering+0.046
Neg logit contributions
 Their-0.173
 Some-0.169
arl-0.160
ily-0.160
itted-0.159
Token We
Token ID775
Feature activation+1.50e-02

Pos logit contributions
eling+0.031
!+0.031
!"+0.027
ited+0.025
!".+0.025
Neg logit contributions
year-0.035
 lesson-0.035
itted-0.035
utes-0.034
arl-0.034
Token miss
Token ID2051
Feature activation+8.51e-02

Pos logit contributions
eling+0.028
ited+0.026
!+0.023
 transported+0.022
ivering+0.022
Neg logit contributions
 Their-0.022
 Some-0.021
ily-0.020
 lesson-0.019
 Everyone-0.019
Token them
Token ID606
Feature activation-4.49e-03

Pos logit contributions
eling+0.139
ited+0.129
!+0.113
 unw+0.102
 transported+0.099
Neg logit contributions
 Their-0.146
 Some-0.140
ily-0.131
arl-0.131
 Everyone-0.128
TokenSpot
Token ID32565
Feature activation+2.09e-02

Pos logit contributions
 Some+0.007
 Their+0.007
itted+0.007
year+0.007
avy+0.007
Neg logit contributions
eling-0.009
ited-0.008
!-0.008
ivering-0.007
!"-0.007
Token grow
Token ID1663
Feature activation+4.96e-02

Pos logit contributions
ited+0.041
ge+0.038
eling+0.038
ivering+0.036
hole+0.034
Neg logit contributions
 Their-0.028
 Some-0.027
 They-0.026
 many-0.025
 Now-0.024
Tokenls
Token ID7278
Feature activation+3.10e-02

Pos logit contributions
eling+0.062
ited+0.057
!+0.048
 unw+0.042
 transported+0.038
Neg logit contributions
 Their-0.104
 Some-0.102
ily-0.097
arl-0.096
 Everyone-0.094
Token and
Token ID290
Feature activation-3.13e-03

Pos logit contributions
eling+0.059
ited+0.052
ivering+0.040
 unw+0.039
ge+0.039
Neg logit contributions
 Their-0.050
 Some-0.048
ily-0.047
 many-0.045
itted-0.043
Token sn
Token ID3013
Feature activation-7.48e-03

Pos logit contributions
 Their+0.005
 Some+0.005
 many+0.004
 They+0.004
 lesson+0.004
Neg logit contributions
ited-0.005
eling-0.005
ivering-0.005
ge-0.004
!-0.004
Tokenarl
Token ID7063
Feature activation-1.92e-01

Pos logit contributions
 Some+0.012
 Their+0.012
 lesson+0.011
 Many+0.011
itted+0.011
Neg logit contributions
eling-0.012
!-0.011
ited-0.011
!"-0.009
ivering-0.009
Tokens
Token ID82
Feature activation+1.80e-02

Pos logit contributions
year+0.441
 Some+0.435
 Their+0.433
ily+0.409
tr+0.404
Neg logit contributions
eling-0.169
!-0.165
ited-0.121
!"-0.116
 lands-0.105
Token.
Token ID13
Feature activation+9.71e-03

Pos logit contributions
eling+0.027
ited+0.025
!+0.022
 unw+0.019
ivering+0.018
Neg logit contributions
 Their-0.033
 Some-0.032
ily-0.031
arl-0.030
itted-0.030
Token He
Token ID679
Feature activation-1.66e-02

Pos logit contributions
!+0.016
eling+0.015
 unw+0.014
ited+0.014
 sigh+0.013
Neg logit contributions
utes-0.014
year-0.013
avy-0.012
oys-0.012
itted-0.012
Token tries
Token ID8404
Feature activation-2.88e-02

Pos logit contributions
 Their+0.027
 Some+0.027
 many+0.024
ily+0.023
 lesson+0.023
Neg logit contributions
ited-0.028
eling-0.026
ivering-0.025
!-0.023
ge-0.023
Token to
Token ID284
Feature activation+9.62e-04

Pos logit contributions
 Their+0.043
year+0.043
arl+0.043
 Some+0.042
utes+0.041
Neg logit contributions
eling-0.047
!-0.046
ited-0.042
 unw-0.038
 bit-0.037
Token'm
Token ID1101
Feature activation+5.67e-02

Pos logit contributions
eling+0.053
ited+0.048
!+0.043
 unw+0.040
 transported+0.039
Neg logit contributions
 Their-0.043
 Some-0.041
ily-0.038
arl-0.037
 lesson-0.037
Token sorry
Token ID7926
Feature activation-1.25e-02

Pos logit contributions
eling+0.118
ited+0.109
!+0.102
 unw+0.092
 transported+0.090
Neg logit contributions
 Their-0.073
 Some-0.070
arl-0.064
ily-0.063
itted-0.062
Token,
Token ID11
Feature activation+6.78e-03

Pos logit contributions
 Their+0.023
 Some+0.023
ily+0.022
arl+0.021
 Everyone+0.021
Neg logit contributions
ited-0.020
eling-0.020
 transported-0.014
 unw-0.014
ge-0.013
Token my
Token ID616
Feature activation-1.15e-02

Pos logit contributions
eling+0.013
ited+0.012
!+0.012
 unw+0.011
!"+0.010
Neg logit contributions
 Their-0.009
 Some-0.009
year-0.008
itted-0.008
arl-0.008
Token d
Token ID288
Feature activation-4.95e-02

Pos logit contributions
 Their+0.018
 Some+0.017
arl+0.016
ily+0.016
itted+0.016
Neg logit contributions
eling-0.021
ited-0.019
!-0.018
 unw-0.016
 transported-0.015
Tokenarl
Token ID7063
Feature activation-1.91e-01

Pos logit contributions
 Their+0.042
 lesson+0.038
 Some+0.032
 many+0.028
 Everyone+0.028
Neg logit contributions
eling-0.120
!-0.117
ivering-0.114
ited-0.113
!"-0.103
Tokenings
Token ID654
Feature activation+7.86e-02

Pos logit contributions
 Their+0.426
 Some+0.419
year+0.408
ily+0.395
 Everyone+0.380
Neg logit contributions
eling-0.202
ited-0.177
!-0.177
!"-0.125
 unw-0.124
Token.
Token ID13
Feature activation+1.90e-02

Pos logit contributions
eling+0.076
ited+0.066
!+0.065
!"+0.043
 unw+0.042
Neg logit contributions
 Their-0.183
 Some-0.178
itted-0.171
utes-0.170
year-0.170
Token That
Token ID1320
Feature activation-3.72e-02

Pos logit contributions
eling+0.027
!+0.027
!"+0.023
ited+0.023
 unw+0.021
Neg logit contributions
utes-0.030
 lesson-0.029
year-0.029
itted-0.028
Years-0.028
Token was
Token ID373
Feature activation+2.07e-02

Pos logit contributions
 Their+0.069
 Some+0.067
 lesson+0.064
ily+0.062
tr+0.062
Neg logit contributions
eling-0.057
ited-0.053
!-0.043
 unw-0.040
ge-0.040
Token a
Token ID257
Feature activation+2.22e-02

Pos logit contributions
eling+0.035
!+0.030
ited+0.029
 unw+0.025
 bit+0.025
Neg logit contributions
 Their-0.034
 Some-0.034
arl-0.033
itted-0.033
utes-0.032
Token you
Token ID345
Feature activation-3.88e-02

Pos logit contributions
eling+0.008
ited+0.008
!+0.007
 unw+0.006
 transported+0.005
Neg logit contributions
 Their-0.012
 Some-0.011
ily-0.011
arl-0.011
itted-0.010
Token okay
Token ID8788
Feature activation+2.11e-02

Pos logit contributions
 Their+0.037
ily+0.035
arl+0.034
itted+0.033
 Some+0.033
Neg logit contributions
eling-0.089
ited-0.084
 unw-0.073
ge-0.072
!-0.072
Token,
Token ID11
Feature activation+2.85e-03

Pos logit contributions
eling+0.038
ited+0.037
 unw+0.026
 transported+0.025
hole+0.025
Neg logit contributions
 Their-0.037
arl-0.034
 Some-0.034
ily-0.033
oys-0.032
Token my
Token ID616
Feature activation-1.17e-02

Pos logit contributions
eling+0.006
ited+0.005
!+0.005
 unw+0.004
 transported+0.004
Neg logit contributions
 Their-0.004
 Some-0.004
arl-0.004
ily-0.004
itted-0.003
Token d
Token ID288
Feature activation-5.27e-02

Pos logit contributions
 Their+0.018
 Some+0.018
arl+0.016
ily+0.016
itted+0.016
Neg logit contributions
eling-0.021
ited-0.019
!-0.018
 unw-0.016
 transported-0.015
Tokenarl
Token ID7063
Feature activation-1.91e-01

Pos logit contributions
 Their+0.041
 lesson+0.038
 Some+0.033
 many+0.028
year+0.028
Neg logit contributions
eling-0.130
!-0.127
ivering-0.124
ited-0.123
!"-0.112
Tokenings
Token ID654
Feature activation+7.78e-02

Pos logit contributions
 Their+0.422
 Some+0.415
year+0.405
ily+0.392
 Everyone+0.376
Neg logit contributions
eling-0.206
ited-0.180
!-0.179
!"-0.127
 unw-0.127
Token?"
Token ID1701
Feature activation+1.82e-02

Pos logit contributions
eling+0.063
!+0.054
ited+0.053
!"+0.032
ivering+0.029
Neg logit contributions
 Their-0.194
 Some-0.191
itted-0.185
year-0.181
utes-0.180
Token Mom
Token ID11254
Feature activation+2.33e-02

Pos logit contributions
eling+0.029
!+0.028
ited+0.025
 unw+0.024
 bit+0.022
Neg logit contributions
tr-0.028
itted-0.028
arl-0.028
utes-0.027
year-0.027
Token asked
Token ID1965
Feature activation+2.95e-02

Pos logit contributions
eling+0.040
ited+0.036
!+0.032
 transported+0.029
 unw+0.028
Neg logit contributions
 Their-0.039
 Some-0.037
ily-0.036
year-0.034
arl-0.033
Token.
Token ID13
Feature activation-4.59e-04

Pos logit contributions
eling+0.035
ited+0.031
!+0.025
 unw+0.021
 transported+0.020
Neg logit contributions
 Their-0.065
 Some-0.063
ily-0.060
arl-0.060
itted-0.058
Token you
Token ID345
Feature activation-3.70e-02

Pos logit contributions
eling+0.066
ited+0.057
!+0.053
 unw+0.043
 transported+0.039
Neg logit contributions
 Their-0.102
 Some-0.099
arl-0.094
ily-0.094
itted-0.093
Token too
Token ID1165
Feature activation+4.58e-03

Pos logit contributions
 Their+0.071
 Some+0.069
ily+0.064
arl+0.062
 many+0.062
Neg logit contributions
eling-0.058
ited-0.052
 transported-0.041
ge-0.040
!-0.039
Token,
Token ID11
Feature activation+8.55e-03

Pos logit contributions
eling+0.008
ited+0.008
 transported+0.006
 emerge+0.006
 unw+0.006
Neg logit contributions
 Their-0.008
 Some-0.008
 many-0.008
arl-0.007
ily-0.007
Token my
Token ID616
Feature activation-1.32e-02

Pos logit contributions
eling+0.016
ited+0.015
!+0.014
 unw+0.012
 transported+0.012
Neg logit contributions
 Their-0.012
 Some-0.012
arl-0.011
ily-0.011
itted-0.011
Token d
Token ID288
Feature activation-4.86e-02

Pos logit contributions
 Their+0.017
 Some+0.016
 lesson+0.016
ily+0.015
arl+0.014
Neg logit contributions
eling-0.028
ited-0.025
 transported-0.022
ge-0.022
!-0.022
Tokenarl
Token ID7063
Feature activation-1.91e-01

Pos logit contributions
 Their+0.040
 lesson+0.035
 Some+0.031
 His+0.027
 many+0.027
Neg logit contributions
eling-0.120
ited-0.117
!-0.113
ivering-0.112
ries-0.101
Tokenings
Token ID654
Feature activation+7.93e-02

Pos logit contributions
 Their+0.418
 Some+0.411
year+0.400
ily+0.390
 Everyone+0.374
Neg logit contributions
eling-0.209
ited-0.184
!-0.178
 unw-0.130
!"-0.126
Token.
Token ID13
Feature activation+2.06e-02

Pos logit contributions
eling+0.073
!+0.066
ited+0.064
!"+0.045
 unw+0.043
Neg logit contributions
 Their-0.185
 Some-0.178
itted-0.173
utes-0.172
year-0.171
Token Now
Token ID2735
Feature activation-5.58e-02

Pos logit contributions
!+0.034
eling+0.033
ited+0.030
!"+0.030
 unw+0.028
Neg logit contributions
utes-0.026
 lesson-0.026
 Their-0.025
year-0.025
itted-0.025
Token let
Token ID1309
Feature activation+6.85e-03

Pos logit contributions
 Their+0.077
ily+0.075
 many+0.073
 Everyone+0.072
 bye+0.070
Neg logit contributions
ited-0.108
eling-0.107
ivering-0.088
ge-0.082
 transported-0.078
Token's
Token ID338
Feature activation+6.49e-04

Pos logit contributions
eling+0.007
ited+0.006
!+0.005
 unw+0.004
 transported+0.004
Neg logit contributions
 Their-0.016
 Some-0.015
ily-0.015
 Everyone-0.015
arl-0.015
Token's
Token ID338
Feature activation+1.99e-04

Pos logit contributions
 Their+0.072
 Some+0.070
 lesson+0.068
tr+0.065
ily+0.065
Neg logit contributions
eling-0.051
ited-0.048
ge-0.036
!-0.035
 unw-0.035
Token good
Token ID922
Feature activation+2.11e-02

Pos logit contributions
eling+0.000
ited+0.000
!+0.000
 unw+0.000
 transported+0.000
Neg logit contributions
 Their-0.000
 Some-0.000
ily-0.000
arl-0.000
itted-0.000
Token,
Token ID11
Feature activation+6.40e-03

Pos logit contributions
eling+0.024
ited+0.021
!+0.018
 unw+0.015
 transported+0.014
Neg logit contributions
 Their-0.047
 Some-0.046
ily-0.044
arl-0.044
itted-0.043
Token my
Token ID616
Feature activation-1.32e-02

Pos logit contributions
eling+0.013
ited+0.012
!+0.011
 unw+0.010
 transported+0.010
Neg logit contributions
 Their-0.008
 Some-0.008
ily-0.007
arl-0.007
itted-0.007
Token d
Token ID288
Feature activation-5.02e-02

Pos logit contributions
 Their+0.018
 Some+0.018
arl+0.016
ily+0.016
itted+0.016
Neg logit contributions
eling-0.026
ited-0.024
!-0.022
 unw-0.020
 transported-0.019
Tokenarl
Token ID7063
Feature activation-1.91e-01

Pos logit contributions
 Their+0.051
 lesson+0.043
 Some+0.043
 His+0.038
year+0.036
Neg logit contributions
eling-0.116
!-0.110
ited-0.107
ivering-0.101
!"-0.095
Tokenings
Token ID654
Feature activation+7.88e-02

Pos logit contributions
 Their+0.389
 Some+0.378
ily+0.358
arl+0.355
year+0.352
Neg logit contributions
eling-0.249
ited-0.222
!-0.199
 unw-0.165
 transported-0.156
Token.
Token ID13
Feature activation+1.99e-02

Pos logit contributions
eling+0.096
ited+0.084
!+0.066
 unw+0.059
 transported+0.059
Neg logit contributions
 Their-0.170
 Some-0.166
ily-0.160
arl-0.159
 Everyone-0.154
Token I
Token ID314
Feature activation+1.91e-02

Pos logit contributions
!+0.030
eling+0.029
!"+0.026
ited+0.025
 unw+0.024
Neg logit contributions
 lesson-0.030
utes-0.027
year-0.027
itted-0.027
Years-0.026
Token'm
Token ID1101
Feature activation+5.66e-02

Pos logit contributions
eling+0.035
ited+0.031
!+0.028
ries+0.027
 transported+0.026
Neg logit contributions
 Their-0.029
 Some-0.028
ily-0.026
 lesson-0.026
year-0.025
Token glad
Token ID9675
Feature activation+2.16e-02

Pos logit contributions
eling+0.111
ited+0.108
!+0.095
 unw+0.087
ge+0.086
Neg logit contributions
 Their-0.079
 Some-0.074
ily-0.070
arl-0.068
year-0.067
Token there
Token ID612
Feature activation+7.63e-03

Pos logit contributions
eling+0.015
ited+0.013
!+0.012
 unw+0.010
 transported+0.009
Neg logit contributions
 Their-0.025
 Some-0.024
arl-0.023
ily-0.023
itted-0.023
Token was
Token ID373
Feature activation-1.09e-03

Pos logit contributions
eling+0.015
ited+0.014
ivering+0.012
 transported+0.011
 unw+0.011
Neg logit contributions
 Their-0.012
 Some-0.011
ily-0.011
 His-0.010
 Everyone-0.010
Token a
Token ID257
Feature activation+3.94e-03

Pos logit contributions
arl+0.002
 Their+0.002
oys+0.002
 Some+0.002
ometer+0.002
Neg logit contributions
eling-0.002
!-0.002
ited-0.002
 unw-0.001
!"-0.001
Token 4
Token ID604
Feature activation-3.27e-02

Pos logit contributions
eling+0.007
ited+0.007
!+0.006
 unw+0.006
ivering+0.005
Neg logit contributions
 Their-0.006
 Some-0.006
arl-0.005
itted-0.005
ily-0.005
Token-
Token ID12
Feature activation-8.50e-02

Pos logit contributions
 Their+0.060
 Some+0.058
ily+0.055
arl+0.055
year+0.054
Neg logit contributions
eling-0.049
ited-0.045
!-0.040
 unw-0.035
 transported-0.033
Tokenyear
Token ID1941
Feature activation-1.90e-01

Pos logit contributions
arl+0.137
 Their+0.129
 Some+0.113
itted+0.113
der+0.111
Neg logit contributions
ited-0.143
eling-0.135
!-0.126
 unw-0.110
ge-0.107
Token old
Token ID1468
Feature activation-3.33e-02

Pos logit contributions
 Their+0.404
 Some+0.396
arl+0.366
ily+0.366
itted+0.354
Neg logit contributions
eling-0.216
ited-0.203
!-0.196
!"-0.144
 unw-0.143
Token girl
Token ID2576
Feature activation-8.13e-03

Pos logit contributions
 Their+0.050
 Some+0.048
arl+0.045
ily+0.044
year+0.044
Neg logit contributions
eling-0.061
ited-0.056
!-0.054
 unw-0.047
 transported-0.044
Token named
Token ID3706
Feature activation-1.24e-02

Pos logit contributions
 Their+0.011
 Some+0.011
ily+0.010
itted+0.009
 Everyone+0.009
Neg logit contributions
eling-0.016
ited-0.015
!-0.013
 unw-0.012
 transported-0.012
Token Sus
Token ID8932
Feature activation+2.71e-02

Pos logit contributions
 Their+0.019
itted+0.018
 Some+0.018
arl+0.018
utes+0.017
Neg logit contributions
eling-0.021
!-0.020
ited-0.019
!"-0.017
 unw-0.017
Tokenie
Token ID494
Feature activation-1.58e-02

Pos logit contributions
eling+0.030
ited+0.027
!+0.023
 transported+0.021
 unw+0.020
Neg logit contributions
 Their-0.057
ily-0.056
 Some-0.055
year-0.054
arl-0.053
Token.
Token ID13
Feature activation-9.79e-03

Pos logit contributions
 Their+0.063
 Some+0.060
ily+0.059
 lesson+0.058
 many+0.057
Neg logit contributions
eling-0.044
ited-0.041
ivering-0.028
!-0.028
 unw-0.028
Token He
Token ID679
Feature activation-3.00e-02

Pos logit contributions
arl+0.016
utes+0.015
year+0.014
ula+0.013
Their+0.013
Neg logit contributions
!-0.015
 unw-0.013
!"-0.012
eling-0.012
 bit-0.012
Token thought
Token ID1807
Feature activation-4.97e-02

Pos logit contributions
 Their+0.058
 Some+0.056
arl+0.054
ily+0.053
itted+0.052
Neg logit contributions
eling-0.042
ited-0.039
!-0.035
 unw-0.030
 transported-0.028
Token for
Token ID329
Feature activation-3.70e-02

Pos logit contributions
 Their+0.091
arl+0.090
 Some+0.087
itted+0.086
tr+0.085
Neg logit contributions
eling-0.071
!-0.067
ited-0.058
 unw-0.054
ge-0.049
Token a
Token ID257
Feature activation+4.19e-03

Pos logit contributions
 Their+0.048
 Some+0.046
arl+0.044
itted+0.042
ily+0.042
Neg logit contributions
eling-0.076
ited-0.070
!-0.067
 unw-0.060
 transported-0.058
Token while
Token ID981
Feature activation+5.61e-02

Pos logit contributions
eling+0.006
ited+0.006
!+0.006
 bit+0.005
 unw+0.005
Neg logit contributions
 Their-0.007
 Some-0.007
itted-0.007
ily-0.007
arl-0.007
Token and
Token ID290
Feature activation-6.22e-03

Pos logit contributions
!+0.051
eling+0.040
ited+0.035
!"+0.026
 bit+0.026
Neg logit contributions
arl-0.138
 Their-0.137
utes-0.131
itted-0.130
 Some-0.130
Token then
Token ID788
Feature activation-3.83e-02

Pos logit contributions
arl+0.014
 Their+0.013
 Some+0.013
utes+0.013
itted+0.012
Neg logit contributions
eling-0.008
!-0.007
ited-0.007
 unw-0.006
 bit-0.005
Token he
Token ID339
Feature activation+1.08e-02

Pos logit contributions
arl+0.073
 Their+0.072
 Some+0.070
utes+0.066
itted+0.066
Neg logit contributions
eling-0.056
!-0.050
ited-0.050
 unw-0.043
 transported-0.041
Token got
Token ID1392
Feature activation-2.38e-02

Pos logit contributions
eling+0.015
ited+0.013
!+0.012
 unw+0.010
ivering+0.010
Neg logit contributions
 Their-0.021
 Some-0.021
ily-0.020
arl-0.019
itted-0.019
Token an
Token ID281
Feature activation+4.90e-02

Pos logit contributions
 Their+0.032
 Some+0.032
ily+0.029
year+0.029
 many+0.028
Neg logit contributions
eling-0.047
ited-0.045
!-0.040
ivering-0.039
ge-0.036
TokenOnce
Token ID7454
Feature activation+2.26e-02

Pos logit contributions
eling+0.067
ited+0.062
!+0.057
 unw+0.051
 transported+0.049
Neg logit contributions
 Their-0.057
 Some-0.055
arl-0.051
ily-0.051
itted-0.050
Token there
Token ID612
Feature activation+1.74e-02

Pos logit contributions
eling+0.032
ited+0.028
!+0.024
 unw+0.022
 transported+0.021
Neg logit contributions
 Their-0.045
 Some-0.043
ily-0.041
arl-0.041
itted-0.039
Token was
Token ID373
Feature activation+1.13e-02

Pos logit contributions
eling+0.038
ited+0.035
ivering+0.031
 transported+0.030
eking+0.029
Neg logit contributions
 Their-0.024
ily-0.021
 His-0.021
 Some-0.020
 Everyone-0.020
Token an
Token ID281
Feature activation+5.54e-02

Pos logit contributions
eling+0.019
!+0.018
ited+0.018
 unw+0.016
!"+0.015
Neg logit contributions
arl-0.017
 Their-0.016
oys-0.016
 Some-0.016
ometer-0.015
Token angry
Token ID7954
Feature activation-2.74e-02

Pos logit contributions
eling+0.098
ited+0.091
!+0.085
 unw+0.074
ivering+0.073
Neg logit contributions
 Their-0.086
 Some-0.083
arl-0.079
itted-0.077
 Everyone-0.075
Token kitten
Token ID42143
Feature activation+6.00e-02

Pos logit contributions
 Their+0.046
 Some+0.046
year+0.044
itted+0.043
arl+0.041
Neg logit contributions
!-0.042
eling-0.042
ited-0.038
!"-0.033
 mess-0.033
Token.
Token ID13
Feature activation+3.05e-03

Pos logit contributions
eling+0.093
ited+0.085
!+0.067
 unw+0.064
 transported+0.063
Neg logit contributions
 Their-0.111
 Some-0.106
ily-0.102
 Everyone-0.098
arl-0.098
Token It
Token ID632
Feature activation-1.83e-02

Pos logit contributions
eling+0.007
ited+0.006
!+0.006
 unw+0.006
ivering+0.005
Neg logit contributions
 Their-0.003
 Some-0.003
arl-0.003
 lesson-0.003
itted-0.003
Token felt
Token ID2936
Feature activation-2.41e-02

Pos logit contributions
 Their+0.032
 Some+0.031
ily+0.029
 Everyone+0.028
 lesson+0.028
Neg logit contributions
eling-0.030
ited-0.028
!-0.024
 unw-0.022
ge-0.021
Token so
Token ID523
Feature activation-2.54e-02

Pos logit contributions
 Their+0.044
 Some+0.043
arl+0.041
oys+0.040
itted+0.040
Neg logit contributions
eling-0.036
ited-0.032
!-0.031
 unw-0.025
ivering-0.024
Token mad
Token ID8805
Feature activation-2.11e-02

Pos logit contributions
 Some+0.044
 Their+0.043
oys+0.041
tr+0.040
utes+0.040
Neg logit contributions
eling-0.039
ited-0.035
!-0.034
 unw-0.028
ivering-0.028
Token past
Token ID1613
Feature activation-5.20e-03

Pos logit contributions
 Their+0.000
 Some+0.000
ily+0.000
year+0.000
arl+0.000
Neg logit contributions
eling-0.000
ited-0.000
!-0.000
 unw-0.000
 transported-0.000
Tokenries
Token ID1678
Feature activation+5.85e-02

Pos logit contributions
 Their+0.011
 Some+0.011
ily+0.011
arl+0.010
 Everyone+0.010
Neg logit contributions
eling-0.006
ited-0.005
!-0.004
 unw-0.004
ivering-0.004
Token.
Token ID13
Feature activation+8.55e-03

Pos logit contributions
eling+0.094
ited+0.085
 transported+0.064
ivering+0.062
 unw+0.059
Neg logit contributions
 Their-0.107
ily-0.099
 Some-0.097
arl-0.096
 many-0.096
Token He
Token ID679
Feature activation-1.57e-02

Pos logit contributions
eling+0.017
!+0.016
ited+0.015
 unw+0.015
!"+0.014
Neg logit contributions
utes-0.009
itted-0.009
oys-0.008
year-0.008
tr-0.008
Token has
Token ID468
Feature activation+5.68e-03

Pos logit contributions
 Their+0.025
 Some+0.025
ily+0.022
 many+0.022
 lesson+0.021
Neg logit contributions
ited-0.026
eling-0.025
ivering-0.023
ge-0.022
!-0.022
Token sold
Token ID2702
Feature activation+6.13e-02

Pos logit contributions
eling+0.010
ited+0.009
!+0.009
 unw+0.008
ivering+0.007
Neg logit contributions
 Their-0.009
 Some-0.009
ily-0.008
arl-0.008
itted-0.008
Token them
Token ID606
Feature activation-8.24e-03

Pos logit contributions
eling+0.107
ited+0.098
!+0.092
 unw+0.081
 transported+0.077
Neg logit contributions
 Their-0.097
 Some-0.093
arl-0.087
ily-0.086
itted-0.085
Token all
Token ID477
Feature activation+2.31e-03

Pos logit contributions
 Their+0.014
ily+0.013
 Some+0.013
 many+0.013
 lesson+0.013
Neg logit contributions
eling-0.014
ited-0.013
ivering-0.010
 transported-0.010
 unw-0.010
Token to
Token ID284
Feature activation-2.88e-04

Pos logit contributions
eling+0.004
ited+0.004
ivering+0.003
 transported+0.003
 unw+0.003
Neg logit contributions
 Their-0.004
 many-0.004
 lesson-0.004
 Some-0.004
ily-0.004
Token other
Token ID584
Feature activation-6.15e-02

Pos logit contributions
 Some+0.000
itted+0.000
 Their+0.000
arl+0.000
year+0.000
Neg logit contributions
eling-0.001
!-0.000
ited-0.000
 unw-0.000
!"-0.000
Token children
Token ID1751
Feature activation-2.51e-02

Pos logit contributions
 Their+0.068
 Some+0.067
ily+0.062
 many+0.060
 Everyone+0.059
Neg logit contributions
eling-0.134
ited-0.130
!-0.112
ivering-0.110
 unw-0.109
Token said
Token ID531
Feature activation-7.70e-03

Pos logit contributions
 Their+0.003
 Some+0.003
ily+0.003
year+0.003
tr+0.003
Neg logit contributions
eling-0.004
ited-0.003
!-0.003
 transported-0.003
ivering-0.003
Token,
Token ID11
Feature activation-8.60e-04

Pos logit contributions
 Their+0.013
 Some+0.013
arl+0.012
ily+0.012
itted+0.012
Neg logit contributions
eling-0.012
ited-0.011
!-0.010
 unw-0.009
 transported-0.009
Token "
Token ID366
Feature activation+2.87e-03

Pos logit contributions
arl+0.001
ests+0.001
itted+0.001
avy+0.001
utes+0.001
Neg logit contributions
!-0.002
eling-0.001
 unw-0.001
!"-0.001
ited-0.001
TokenIt
Token ID1026
Feature activation-4.77e-03

Pos logit contributions
ited+0.005
eling+0.005
!+0.004
 unw+0.004
 bit+0.004
Neg logit contributions
 Their-0.005
arl-0.004
 Some-0.004
avy-0.004
ily-0.004
Token's
Token ID338
Feature activation-1.39e-03

Pos logit contributions
 Their+0.009
 Some+0.009
 Everyone+0.009
ily+0.009
tr+0.009
Neg logit contributions
eling-0.007
ited-0.006
ivering-0.005
ge-0.004
!-0.004
Token still
Token ID991
Feature activation+4.80e-02

Pos logit contributions
 Their+0.002
 Some+0.002
ily+0.002
 lesson+0.002
arl+0.002
Neg logit contributions
eling-0.003
ited-0.003
!-0.002
ivering-0.002
 unw-0.002
Token a
Token ID257
Feature activation+1.65e-02

Pos logit contributions
eling+0.096
ited+0.090
!+0.078
 unw+0.073
ivering+0.073
Neg logit contributions
 Their-0.066
 Some-0.061
 lesson-0.058
ily-0.058
arl-0.057
Token fun
Token ID1257
Feature activation+1.57e-02

Pos logit contributions
eling+0.032
ited+0.030
!+0.028
 unw+0.025
 transported+0.024
Neg logit contributions
 Their-0.023
 Some-0.022
arl-0.020
ily-0.020
itted-0.020
Token toy
Token ID13373
Feature activation-2.56e-02

Pos logit contributions
eling+0.030
ited+0.028
!+0.026
 unw+0.023
 transported+0.022
Neg logit contributions
 Their-0.022
 Some-0.021
ily-0.020
arl-0.020
itted-0.019
Token to
Token ID284
Feature activation+1.99e-03

Pos logit contributions
 Their+0.048
 Some+0.046
ily+0.045
 Everyone+0.043
 lesson+0.043
Neg logit contributions
eling-0.040
ited-0.036
!-0.027
 unw-0.026
ivering-0.026
Token play
Token ID711
Feature activation-1.88e-03

Pos logit contributions
eling+0.003
ited+0.003
!+0.003
 transported+0.002
ivering+0.002
Neg logit contributions
 Their-0.003
 Some-0.003
ily-0.003
 Everyone-0.003
arl-0.003
Token loves
Token ID10408
Feature activation+1.86e-02

Pos logit contributions
eling+0.004
ited+0.003
!+0.003
 unw+0.003
 bit+0.003
Neg logit contributions
 Their-0.004
 Some-0.004
ily-0.004
arl-0.004
itted-0.004
Token children
Token ID1751
Feature activation-2.05e-02

Pos logit contributions
eling+0.026
!+0.026
ited+0.022
!"+0.021
 unw+0.019
Neg logit contributions
 Their-0.035
 Some-0.033
arl-0.032
year-0.032
itted-0.032
Token.
Token ID13
Feature activation+1.95e-02

Pos logit contributions
 Their+0.037
 Some+0.035
 many+0.034
ily+0.033
 Everyone+0.032
Neg logit contributions
eling-0.033
ited-0.032
 transported-0.024
 unw-0.024
ivering-0.022
Token You
Token ID921
Feature activation-2.50e-02

Pos logit contributions
!+0.033
eling+0.030
!"+0.030
!".+0.027
 unw+0.026
Neg logit contributions
 lesson-0.026
utes-0.025
itted-0.024
year-0.023
avy-0.023
Token can
Token ID460
Feature activation-1.10e-02

Pos logit contributions
 Their+0.038
 Some+0.037
 Everyone+0.034
ily+0.034
 lesson+0.033
Neg logit contributions
eling-0.046
ited-0.042
!-0.037
ivering-0.035
ries-0.035
Token pet
Token ID4273
Feature activation+6.64e-02

Pos logit contributions
 Their+0.020
 Some+0.019
arl+0.018
ily+0.018
itted+0.018
Neg logit contributions
eling-0.017
ited-0.015
!-0.014
 unw-0.012
 transported-0.011
Token him
Token ID683
Feature activation+1.31e-03

Pos logit contributions
eling+0.094
!+0.085
ited+0.081
!"+0.067
 unw+0.065
Neg logit contributions
 Their-0.126
 Some-0.119
itted-0.117
arl-0.117
year-0.117
Token if
Token ID611
Feature activation+3.70e-03

Pos logit contributions
eling+0.002
ited+0.002
!+0.001
 unw+0.001
 transported+0.001
Neg logit contributions
 Their-0.002
 Some-0.002
ily-0.002
arl-0.002
 lesson-0.002
Token you
Token ID345
Feature activation-3.69e-02

Pos logit contributions
eling+0.006
!+0.005
ited+0.005
 unw+0.004
!"+0.004
Neg logit contributions
 Their-0.006
 Some-0.006
arl-0.006
itted-0.006
year-0.006
Token want
Token ID765
Feature activation+4.22e-02

Pos logit contributions
 Their+0.056
ily+0.053
 Some+0.053
 Everyone+0.051
 lesson+0.050
Neg logit contributions
eling-0.065
ited-0.059
ivering-0.050
 transported-0.050
ries-0.050
Token."
Token ID526
Feature activation+3.58e-02

Pos logit contributions
eling+0.070
ited+0.064
!+0.060
 unw+0.052
 transported+0.049
Neg logit contributions
 Their-0.071
 Some-0.068
arl-0.064
ily-0.064
itted-0.063
Token it
Token ID340
Feature activation-2.31e-02

Pos logit contributions
eling+0.019
ited+0.019
!+0.016
 unw+0.014
ivering+0.014
Neg logit contributions
 Their-0.021
 Some-0.020
ily-0.019
itted-0.018
arl-0.018
Token safe
Token ID3338
Feature activation+5.19e-02

Pos logit contributions
 many+0.033
 lesson+0.032
ily+0.032
 Their+0.031
ot+0.029
Neg logit contributions
eling-0.043
ited-0.040
ries-0.031
ivering-0.029
 transported-0.029
Token.
Token ID13
Feature activation+5.69e-03

Pos logit contributions
eling+0.083
ited+0.077
 unw+0.058
!+0.056
 transported+0.056
Neg logit contributions
 Their-0.089
 Some-0.085
ily-0.085
 lesson-0.083
 many-0.082
Token Rose
Token ID8049
Feature activation-1.85e-02

Pos logit contributions
!+0.010
eling+0.009
ited+0.009
 unw+0.008
!"+0.008
Neg logit contributions
arl-0.008
 lesson-0.007
oys-0.007
itted-0.007
utes-0.007
Token thanked
Token ID26280
Feature activation+4.08e-02

Pos logit contributions
 Their+0.030
 Some+0.027
ily+0.027
 Everyone+0.026
 lesson+0.026
Neg logit contributions
eling-0.031
ited-0.028
!-0.025
ge-0.024
ries-0.024
Token the
Token ID262
Feature activation+1.17e-02

Pos logit contributions
eling+0.052
!+0.051
 unw+0.038
ited+0.035
 breath+0.034
Neg logit contributions
itted-0.082
 Some-0.082
tr-0.081
 Their-0.081
arl-0.081
Token man
Token ID582
Feature activation+3.53e-02

Pos logit contributions
eling+0.018
!+0.017
ited+0.016
 unw+0.014
 bit+0.014
Neg logit contributions
 Their-0.021
 Some-0.020
itted-0.019
arl-0.019
oys-0.018
Token,
Token ID11
Feature activation+4.95e-03

Pos logit contributions
eling+0.050
ited+0.046
!+0.043
 unw+0.036
 transported+0.033
Neg logit contributions
 Their-0.067
 Some-0.065
arl-0.062
ily-0.061
itted-0.060
Token put
Token ID1234
Feature activation+9.32e-03

Pos logit contributions
eling+0.007
ited+0.006
!+0.006
 unw+0.005
 transported+0.005
Neg logit contributions
 Their-0.009
 Some-0.009
arl-0.009
ily-0.009
itted-0.008
Token her
Token ID607
Feature activation+1.88e-02

Pos logit contributions
eling+0.013
ited+0.012
!+0.012
 unw+0.009
ries+0.008
Neg logit contributions
 Their-0.018
 Some-0.018
arl-0.017
oys-0.016
itted-0.016
Token hand
Token ID1021
Feature activation+8.15e-03

Pos logit contributions
eling+0.033
ited+0.031
!+0.027
ivering+0.024
 unw+0.024
Neg logit contributions
 Their-0.030
 Some-0.028
ily-0.027
 lesson-0.027
arl-0.026
Token and
Token ID290
Feature activation-1.94e-03

Pos logit contributions
eling+0.061
ited+0.057
!+0.047
 unw+0.043
ivering+0.042
Neg logit contributions
 Their-0.063
 Some-0.060
ily-0.057
arl-0.057
 lesson-0.055
Token splash
Token ID22870
Feature activation+4.19e-02

Pos logit contributions
 Their+0.004
 Some+0.004
 many+0.003
ily+0.003
arl+0.003
Neg logit contributions
ited-0.003
eling-0.002
ivering-0.002
!-0.002
 transported-0.002
Token into
Token ID656
Feature activation-9.72e-03

Pos logit contributions
eling+0.056
ited+0.050
!+0.046
 unw+0.038
 transported+0.036
Neg logit contributions
 Their-0.084
 Some-0.082
arl-0.077
ily-0.077
itted-0.076
Token the
Token ID262
Feature activation+3.87e-03

Pos logit contributions
 Their+0.016
 Some+0.016
ily+0.014
 many+0.014
 Everyone+0.014
Neg logit contributions
eling-0.017
ited-0.016
!-0.013
ivering-0.013
 transported-0.012
Token water
Token ID1660
Feature activation+2.89e-02

Pos logit contributions
eling+0.006
ited+0.005
!+0.005
 unw+0.004
ivering+0.004
Neg logit contributions
 Their-0.007
 Some-0.007
ily-0.006
itted-0.006
year-0.006
Token.
Token ID13
Feature activation+5.60e-03

Pos logit contributions
eling+0.050
ited+0.046
 transported+0.035
 unw+0.034
mie+0.032
Neg logit contributions
 Their-0.049
arl-0.046
 Some-0.046
ily-0.045
 many-0.045
Token The
Token ID383
Feature activation+1.04e-03

Pos logit contributions
!+0.010
 unw+0.009
!"+0.008
 bit+0.008
eling+0.008
Neg logit contributions
year-0.007
arl-0.006
avy-0.006
ative-0.006
utes-0.006
Token bear
Token ID6842
Feature activation+1.59e-02

Pos logit contributions
eling+0.002
ited+0.001
!+0.001
 mess+0.001
 unw+0.001
Neg logit contributions
 Their-0.002
 Some-0.002
itted-0.002
year-0.002
ily-0.002
Token follows
Token ID5679
Feature activation+1.31e-02

Pos logit contributions
ited+0.031
 avail+0.029
eling+0.028
ge+0.028
ivering+0.026
Neg logit contributions
 Their-0.022
 many-0.020
 Some-0.020
 lesson-0.019
 They-0.018
Token them
Token ID606
Feature activation-1.31e-02

Pos logit contributions
eling+0.018
!+0.017
ited+0.016
 unw+0.013
!"+0.013
Neg logit contributions
itted-0.024
 Some-0.024
arl-0.024
tr-0.024
 Their-0.024
Token.
Token ID13
Feature activation-1.15e-03

Pos logit contributions
 Their+0.018
 many+0.017
ily+0.016
 lesson+0.016
 Some+0.015
Neg logit contributions
eling-0.028
ited-0.027
 avail-0.021
ivering-0.021
ograph-0.021
Token really
Token ID1107
Feature activation+7.22e-03

Pos logit contributions
eling+0.016
!+0.014
ited+0.013
 struggling+0.012
 unw+0.012
Neg logit contributions
arl-0.021
oys-0.020
 Their-0.020
utes-0.020
 Some-0.019
Token brave
Token ID14802
Feature activation-2.75e-03

Pos logit contributions
eling+0.011
ited+0.009
!+0.009
 unw+0.007
 bit+0.007
Neg logit contributions
 Their-0.013
 Some-0.013
arl-0.012
oys-0.012
utes-0.012
Token and
Token ID290
Feature activation-1.55e-03

Pos logit contributions
 Some+0.005
 Their+0.005
ily+0.004
 many+0.004
itted+0.004
Neg logit contributions
eling-0.005
ited-0.004
 unw-0.003
ivering-0.003
ge-0.003
Token he
Token ID339
Feature activation+1.67e-02

Pos logit contributions
arl+0.003
utes+0.003
 Their+0.003
 Some+0.003
itted+0.003
Neg logit contributions
eling-0.002
!-0.002
ited-0.002
 bit-0.002
 unw-0.002
Token liked
Token ID8288
Feature activation+2.67e-02

Pos logit contributions
eling+0.034
ited+0.032
!+0.028
ge+0.027
ivering+0.027
Neg logit contributions
 Their-0.022
 Some-0.021
ily-0.020
 lesson-0.019
year-0.019
Token to
Token ID284
Feature activation+1.53e-03

Pos logit contributions
eling+0.049
!+0.045
ited+0.043
 unw+0.039
 transported+0.037
Neg logit contributions
 Their-0.039
arl-0.037
 Some-0.037
ily-0.036
itted-0.035
Token explore
Token ID7301
Feature activation+6.33e-02

Pos logit contributions
eling+0.002
ited+0.002
!+0.002
 unw+0.002
 transported+0.001
Neg logit contributions
 Their-0.003
 Some-0.003
arl-0.003
itted-0.003
year-0.003
Token!
Token ID0
Feature activation-2.67e-02

Pos logit contributions
ited+0.115
eling+0.114
ivering+0.089
 unw+0.086
 transported+0.086
Neg logit contributions
 Their-0.097
 many-0.096
 Some-0.094
 Everyone-0.089
ily-0.086
Token
Token ID220
Feature activation+1.47e-03

Pos logit contributions
year+0.038
arl+0.037
utes+0.037
itted+0.036
oys+0.035
Neg logit contributions
eling-0.044
!-0.043
ited-0.041
 unw-0.036
!"-0.036
Token
Token ID198
Feature activation-1.11e-02

Pos logit contributions
eling+0.002
!+0.002
ited+0.002
!"+0.002
 unw+0.002
Neg logit contributions
 Their-0.002
 Some-0.002
arl-0.002
year-0.002
itted-0.002
Token
Token ID198
Feature activation-1.13e-02

Pos logit contributions
 Their+0.015
arl+0.014
 Some+0.014
itted+0.013
year+0.013
Neg logit contributions
eling-0.021
ited-0.020
!-0.019
 unw-0.017
!"-0.017
Token of
Token ID286
Feature activation-3.89e-02

Pos logit contributions
eling+0.051
ited+0.048
!+0.044
 unw+0.040
 transported+0.039
Neg logit contributions
 Their-0.032
 Some-0.030
ily-0.028
arl-0.028
itted-0.027
Token her
Token ID607
Feature activation+1.94e-02

Pos logit contributions
 Some+0.069
tr+0.068
itted+0.067
 Their+0.067
ometer+0.066
Neg logit contributions
eling-0.063
!-0.053
ited-0.044
 unw-0.042
 struggling-0.041
Token mom
Token ID1995
Feature activation-4.49e-03

Pos logit contributions
eling+0.036
ited+0.033
!+0.030
 unw+0.027
 transported+0.026
Neg logit contributions
 Their-0.029
 Some-0.028
ily-0.026
arl-0.026
itted-0.025
Token for
Token ID329
Feature activation-2.64e-02

Pos logit contributions
 Their+0.009
 Some+0.008
ily+0.008
year+0.008
arl+0.008
Neg logit contributions
eling-0.006
ited-0.006
!-0.005
 unw-0.004
 transported-0.004
Token helping
Token ID5742
Feature activation+4.11e-03

Pos logit contributions
 Some+0.055
 Their+0.054
arl+0.052
itted+0.051
 Everyone+0.051
Neg logit contributions
eling-0.035
!-0.028
ited-0.027
 unw-0.024
 struggling-0.022
Token her
Token ID607
Feature activation+1.97e-02

Pos logit contributions
eling+0.007
ited+0.006
!+0.006
 unw+0.005
 transported+0.005
Neg logit contributions
 Their-0.007
 Some-0.007
arl-0.006
itted-0.006
year-0.006
Token and
Token ID290
Feature activation+1.01e-03

Pos logit contributions
eling+0.036
ited+0.034
!+0.030
 unw+0.027
 transported+0.026
Neg logit contributions
 Their-0.031
 Some-0.029
ily-0.027
arl-0.027
itted-0.026
Token she
Token ID673
Feature activation+3.24e-02

Pos logit contributions
eling+0.001
!+0.001
ited+0.001
 unw+0.001
 transported+0.001
Neg logit contributions
 Some-0.002
 Their-0.002
arl-0.002
ily-0.002
itted-0.002
Token wasn
Token ID2492
Feature activation+2.25e-02

Pos logit contributions
eling+0.057
ited+0.053
!+0.047
 unw+0.042
ge+0.042
Neg logit contributions
 Their-0.052
 Some-0.048
ily-0.046
 lesson-0.045
 Everyone-0.045
Token't
Token ID470
Feature activation+3.31e-02

Pos logit contributions
eling+0.040
ited+0.037
!+0.033
 unw+0.029
 transported+0.029
Neg logit contributions
 Their-0.036
 Some-0.035
arl-0.032
ily-0.032
 Everyone-0.032
Token ashamed
Token ID22461
Feature activation+1.70e-02

Pos logit contributions
eling+0.052
ited+0.048
!+0.041
 unw+0.038
 transported+0.035
Neg logit contributions
 Their-0.058
 Some-0.056
ily-0.054
arl-0.052
itted-0.052
Token it
Token ID340
Feature activation-3.08e-02

Pos logit contributions
 Their+0.051
 Some+0.049
arl+0.048
oys+0.045
itted+0.045
Neg logit contributions
eling-0.055
ited-0.049
!-0.048
 unw-0.042
 transported-0.040
Token all
Token ID477
Feature activation-6.74e-03

Pos logit contributions
 many+0.046
 Their+0.046
 lesson+0.045
 Some+0.044
itted+0.042
Neg logit contributions
eling-0.056
ited-0.054
ivering-0.043
!-0.040
ries-0.040
Token the
Token ID262
Feature activation+1.29e-03

Pos logit contributions
 many+0.009
 Their+0.009
 lesson+0.009
 Everyone+0.009
 Some+0.008
Neg logit contributions
eling-0.013
ited-0.012
ivering-0.010
 unw-0.010
ries-0.010
Token way
Token ID835
Feature activation+6.55e-02

Pos logit contributions
eling+0.002
ited+0.002
!+0.002
 unw+0.002
ivering+0.002
Neg logit contributions
 Their-0.002
 Some-0.002
 lesson-0.002
ily-0.002
 Everyone-0.002
Token to
Token ID284
Feature activation-6.80e-03

Pos logit contributions
eling+0.137
ited+0.129
 unw+0.104
ivering+0.101
ries+0.098
Neg logit contributions
 Their-0.083
 many-0.080
itted-0.077
 Some-0.076
 lesson-0.074
Token the
Token ID262
Feature activation+1.28e-03

Pos logit contributions
 Their+0.013
 Some+0.013
arl+0.012
itted+0.012
ily+0.012
Neg logit contributions
eling-0.010
ited-0.009
!-0.008
 unw-0.007
 transported-0.006
Token top
Token ID1353
Feature activation+9.76e-03

Pos logit contributions
eling+0.002
ited+0.002
!+0.002
 unw+0.002
 transported+0.002
Neg logit contributions
 Their-0.002
 Some-0.002
arl-0.002
ily-0.002
itted-0.002
Token of
Token ID286
Feature activation-4.48e-02

Pos logit contributions
eling+0.018
ited+0.017
ivering+0.013
 unw+0.013
!+0.013
Neg logit contributions
 Their-0.014
 Some-0.014
 Everyone-0.013
ily-0.013
 many-0.012
Token the
Token ID262
Feature activation+2.19e-03

Pos logit contributions
 Their+0.079
 Some+0.076
ily+0.071
arl+0.071
itted+0.069
Neg logit contributions
eling-0.071
ited-0.066
!-0.058
 unw-0.051
ivering-0.050
Token cable
Token ID7862
Feature activation+3.25e-02

Pos logit contributions
eling+0.004
ited+0.004
!+0.004
 unw+0.003
ge+0.003
Neg logit contributions
 Their-0.003
 Some-0.003
 Everyone-0.003
 lesson-0.003
arl-0.003
Token and
Token ID290
Feature activation-1.35e-02

Pos logit contributions
eling+0.048
ited+0.043
ivering+0.031
 unw+0.030
ge+0.030
Neg logit contributions
 Their-0.061
 Some-0.058
 Everyone-0.057
 many-0.056
ily-0.056
Token started
Token ID2067
Feature activation-2.34e-02

Pos logit contributions
eling+0.046
ited+0.045
ivering+0.035
ge+0.034
 unw+0.034
Neg logit contributions
 Their-0.043
ily-0.040
 Some-0.038
 lesson-0.037
 many-0.037
Token spilling
Token ID45085
Feature activation+5.15e-03

Pos logit contributions
 Their+0.028
 Some+0.027
itted+0.023
ily+0.023
 lesson+0.023
Neg logit contributions
eling-0.052
ited-0.049
!-0.042
ivering-0.040
 unw-0.039
Token everywhere
Token ID8347
Feature activation+2.88e-02

Pos logit contributions
eling+0.009
ited+0.009
!+0.007
ivering+0.007
 unw+0.007
Neg logit contributions
 Their-0.008
 Some-0.008
ily-0.007
 many-0.007
arl-0.007
Token.
Token ID13
Feature activation-1.20e-02

Pos logit contributions
eling+0.039
ited+0.036
 unw+0.024
ivering+0.023
 transported+0.023
Neg logit contributions
 Their-0.059
ily-0.056
 Some-0.056
 many-0.053
 lesson-0.053
Token M
Token ID337
Feature activation+7.82e-03

Pos logit contributions
oys+0.020
arl+0.020
tr+0.020
year+0.018
 lesson+0.018
Neg logit contributions
!-0.017
eling-0.016
ited-0.014
 unw-0.013
 transported-0.013
Tokenandy
Token ID10757
Feature activation-5.73e-02

Pos logit contributions
eling+0.012
ited+0.011
!+0.010
 unw+0.008
ivering+0.008
Neg logit contributions
 Their-0.015
 Some-0.014
arl-0.013
year-0.013
ily-0.013
Token was
Token ID373
Feature activation+7.65e-03

Pos logit contributions
 Their+0.103
 Some+0.097
ily+0.095
 Everyone+0.090
 lesson+0.089
Neg logit contributions
eling-0.089
ited-0.082
!-0.067
ge-0.063
 unw-0.061
Token scared
Token ID12008
Feature activation+4.17e-02

Pos logit contributions
eling+0.014
ited+0.014
!+0.012
ge+0.011
 unw+0.011
Neg logit contributions
 Their-0.012
 Some-0.011
ily-0.011
 many-0.010
itted-0.010
Token and
Token ID290
Feature activation-1.54e-02

Pos logit contributions
eling+0.064
ited+0.058
!+0.046
 unw+0.045
 transported+0.043
Neg logit contributions
 Their-0.078
 Some-0.075
ily-0.072
arl-0.070
itted-0.069
Token felt
Token ID2936
Feature activation-3.00e-02

Pos logit contributions
 Their+0.029
 Some+0.027
ily+0.026
 Everyone+0.026
arl+0.026
Neg logit contributions
eling-0.022
ited-0.021
!-0.018
 unw-0.016
ge-0.015
Token bad
Token ID2089
Feature activation-2.61e-02

Pos logit contributions
 Their+0.046
 Some+0.045
ily+0.042
 Everyone+0.040
year+0.039
Neg logit contributions
eling-0.054
ited-0.052
!-0.045
 unw-0.043
ge-0.042
Token the
Token ID262
Feature activation+3.77e-03

Pos logit contributions
 Their+0.080
 Some+0.078
arl+0.073
ily+0.073
itted+0.071
Neg logit contributions
eling-0.076
ited-0.069
!-0.063
 unw-0.055
 transported-0.053
Token fun
Token ID1257
Feature activation+1.17e-02

Pos logit contributions
eling+0.007
ited+0.007
!+0.006
 unw+0.006
ivering+0.006
Neg logit contributions
 Some-0.005
 Their-0.005
ily-0.005
 lesson-0.005
 many-0.005
Token attractions
Token ID29156
Feature activation+1.08e-02

Pos logit contributions
eling+0.022
ited+0.022
 unw+0.018
ivering+0.018
 transported+0.017
Neg logit contributions
 Their-0.016
ily-0.015
 lesson-0.015
 Some-0.015
 Everyone-0.015
Token.
Token ID13
Feature activation-1.44e-03

Pos logit contributions
eling+0.016
ited+0.014
!+0.011
 unw+0.011
ivering+0.011
Neg logit contributions
 Their-0.021
 Some-0.020
ily-0.019
 Everyone-0.019
 lesson-0.018
Token They
Token ID1119
Feature activation-1.38e-02

Pos logit contributions
utes+0.002
arl+0.002
year+0.002
itted+0.002
avy+0.002
Neg logit contributions
!-0.003
ited-0.002
eling-0.002
 anymore-0.002
 unw-0.002
Token realized
Token ID6939
Feature activation-7.44e-02

Pos logit contributions
 Their+0.024
 Some+0.023
ily+0.021
arl+0.021
itted+0.021
Neg logit contributions
eling-0.023
ited-0.021
!-0.019
 unw-0.017
 transported-0.016
Token that
Token ID326
Feature activation-3.02e-02

Pos logit contributions
 Some+0.119
 Their+0.117
arl+0.117
itted+0.113
year+0.109
Neg logit contributions
eling-0.125
!-0.117
ited-0.108
 unw-0.092
 transported-0.091
Token the
Token ID262
Feature activation+4.44e-03

Pos logit contributions
 Their+0.049
 Some+0.048
arl+0.045
ily+0.044
itted+0.044
Neg logit contributions
eling-0.051
ited-0.047
!-0.043
 unw-0.038
 transported-0.037
Token park
Token ID3952
Feature activation+3.58e-02

Pos logit contributions
eling+0.007
ited+0.006
!+0.006
 unw+0.005
 transported+0.005
Neg logit contributions
 Their-0.008
 Some-0.008
arl-0.007
ily-0.007
year-0.007
Token was
Token ID373
Feature activation+1.30e-02

Pos logit contributions
eling+0.064
ited+0.060
ivering+0.047
 unw+0.047
!+0.047
Neg logit contributions
 Their-0.058
 lesson-0.057
 Some-0.054
ily-0.052
 Everyone-0.052
Token the
Token ID262
Feature activation+4.60e-03

Pos logit contributions
eling+0.026
ited+0.024
!+0.021
 unw+0.020
ivering+0.020
Neg logit contributions
 Their-0.018
 Some-0.017
 lesson-0.017
ily-0.016
 many-0.016
Token.
Token ID13
Feature activation-1.63e-02

Pos logit contributions
 Their+0.060
 Some+0.058
ily+0.055
arl+0.055
itted+0.055
Neg logit contributions
eling-0.045
ited-0.041
!-0.033
 unw-0.030
 transported-0.029
Token "
Token ID366
Feature activation+4.87e-03

Pos logit contributions
arl+0.024
 lesson+0.023
itted+0.023
tr+0.023
year+0.022
Neg logit contributions
!-0.027
eling-0.025
ited-0.022
 unw-0.022
!"-0.021
TokenThat
Token ID2504
Feature activation-4.25e-02

Pos logit contributions
ited+0.009
eling+0.009
!+0.007
 unw+0.007
 bit+0.007
Neg logit contributions
 Their-0.008
arl-0.007
utes-0.007
 Some-0.007
oys-0.007
Token is
Token ID318
Feature activation+1.70e-02

Pos logit contributions
 Their+0.082
 Some+0.082
 lesson+0.082
tr+0.080
 Everyone+0.077
Neg logit contributions
eling-0.059
ited-0.052
 unw-0.038
!-0.038
ge-0.037
Token why
Token ID1521
Feature activation+1.71e-02

Pos logit contributions
eling+0.031
ited+0.028
!+0.026
 unw+0.023
 transported+0.022
Neg logit contributions
 Their-0.027
 Some-0.026
arl-0.024
ily-0.024
itted-0.023
Token you
Token ID345
Feature activation-4.23e-02

Pos logit contributions
eling+0.026
ited+0.024
!+0.021
 unw+0.019
 transported+0.018
Neg logit contributions
 Their-0.031
 Some-0.030
ily-0.028
arl-0.028
itted-0.028
Token must
Token ID1276
Feature activation+2.41e-02

Pos logit contributions
 Their+0.074
 Some+0.071
arl+0.070
 Everyone+0.069
ily+0.068
Neg logit contributions
eling-0.063
ited-0.058
ries-0.046
 unw-0.045
 transported-0.043
Token be
Token ID307
Feature activation-7.53e-03

Pos logit contributions
eling+0.039
ited+0.038
!+0.030
ries+0.029
 transported+0.029
Neg logit contributions
 Their-0.041
ily-0.038
arl-0.038
 Some-0.037
 Everyone-0.037
Token careful
Token ID8161
Feature activation-1.40e-02

Pos logit contributions
 Their+0.013
 Some+0.012
ily+0.011
 many+0.011
itted+0.011
Neg logit contributions
eling-0.012
ited-0.012
!-0.010
 unw-0.009
ge-0.009
Token".
Token ID1911
Feature activation+2.39e-02

Pos logit contributions
 Their+0.027
 Some+0.026
ily+0.026
oys+0.026
arl+0.025
Neg logit contributions
eling-0.019
ited-0.018
 unw-0.013
 transported-0.012
!-0.012
Token Joe
Token ID5689
Feature activation+2.96e-02

Pos logit contributions
eling+0.043
!+0.040
ited+0.039
 unw+0.034
!"+0.033
Neg logit contributions
 Their-0.034
arl-0.033
 Some-0.033
ily-0.032
year-0.032
Token the
Token ID262
Feature activation+6.90e-03

Pos logit contributions
eling+0.020
ited+0.018
!+0.017
 unw+0.015
 transported+0.014
Neg logit contributions
 Their-0.022
 Some-0.021
arl-0.020
ily-0.020
itted-0.020
Token big
Token ID1263
Feature activation+1.94e-02

Pos logit contributions
eling+0.012
ited+0.011
!+0.011
 unw+0.010
 transported+0.009
Neg logit contributions
 Their-0.011
 Some-0.010
arl-0.009
ily-0.009
year-0.009
Token engine
Token ID3113
Feature activation-5.12e-02

Pos logit contributions
eling+0.039
ited+0.036
!+0.031
 unw+0.030
 transported+0.029
Neg logit contributions
 Their-0.027
 Some-0.026
arl-0.023
 lesson-0.023
 Everyone-0.023
Token.
Token ID13
Feature activation+7.97e-04

Pos logit contributions
 Their+0.101
 Some+0.098
ily+0.093
arl+0.092
itted+0.092
Neg logit contributions
eling-0.070
ited-0.063
!-0.050
 unw-0.045
 transported-0.043
Token She
Token ID1375
Feature activation+8.92e-03

Pos logit contributions
!+0.001
eling+0.001
ited+0.001
 unw+0.001
!"+0.001
Neg logit contributions
year-0.001
utes-0.001
avy-0.001
arl-0.001
oys-0.001
Token knew
Token ID2993
Feature activation-6.07e-02

Pos logit contributions
eling+0.016
ited+0.014
!+0.013
ivering+0.012
ge+0.011
Neg logit contributions
 Their-0.014
 Some-0.013
ily-0.013
 lesson-0.012
 Everyone-0.012
Token that
Token ID326
Feature activation-2.85e-02

Pos logit contributions
 Some+0.114
arl+0.113
 Their+0.112
itted+0.109
year+0.106
Neg logit contributions
eling-0.085
!-0.083
ited-0.066
 unw-0.063
 struggling-0.059
Token taking
Token ID2263
Feature activation-1.95e-02

Pos logit contributions
 Their+0.048
 Some+0.047
arl+0.045
ily+0.043
itted+0.043
Neg logit contributions
eling-0.047
ited-0.043
!-0.040
 unw-0.035
 transported-0.033
Token a
Token ID257
Feature activation+1.22e-02

Pos logit contributions
 Their+0.037
arl+0.036
 Some+0.035
tr+0.033
oys+0.033
Neg logit contributions
eling-0.027
!-0.026
ited-0.024
 unw-0.020
!"-0.019
Token bath
Token ID7837
Feature activation-3.45e-02

Pos logit contributions
eling+0.023
ited+0.022
!+0.020
 unw+0.018
 transported+0.018
Neg logit contributions
 Their-0.017
 Some-0.017
ily-0.016
arl-0.015
itted-0.015
Token was
Token ID373
Feature activation+1.16e-02

Pos logit contributions
 Their+0.059
 Some+0.058
arl+0.053
year+0.052
ily+0.052
Neg logit contributions
eling-0.055
ited-0.050
!-0.049
 unw-0.041
!"-0.039
Token was
Token ID373
Feature activation+3.60e-03

Pos logit contributions
 Their+0.098
ily+0.091
 Some+0.091
year+0.091
 Everyone+0.089
Neg logit contributions
eling-0.090
ited-0.083
!-0.068
 unw-0.066
ivering-0.066
Token amazed
Token ID24562
Feature activation+3.84e-02

Pos logit contributions
eling+0.007
ited+0.006
ries+0.005
 unw+0.005
 tastes+0.005
Neg logit contributions
 Their-0.006
 many-0.005
 Some-0.005
itted-0.005
ily-0.005
Token!
Token ID0
Feature activation-4.27e-02

Pos logit contributions
eling+0.070
ited+0.060
ivering+0.049
 unw+0.049
 transported+0.047
Neg logit contributions
 Their-0.057
itted-0.054
 many-0.052
 Everyone-0.052
oys-0.052
Token\n
Token ID198
Feature activation-2.11e-02

Pos logit contributions
tr+0.060
arl+0.059
year+0.058
 lesson+0.056
 Their+0.056
Neg logit contributions
!-0.070
eling-0.066
ited-0.061
 unw-0.058
 transported-0.055
Token\n
Token ID198
Feature activation-1.97e-02

Pos logit contributions
 Their+0.029
 Some+0.028
arl+0.028
tr+0.028
itted+0.027
Neg logit contributions
eling-0.038
!-0.036
ited-0.034
 unw-0.031
!"-0.029
TokenJ
Token ID41
Feature activation-8.23e-02

Pos logit contributions
 Their+0.023
 Some+0.021
arl+0.020
tr+0.019
itted+0.019
Neg logit contributions
eling-0.043
ited-0.041
!-0.038
 unw-0.035
 transported-0.034
Tokenenny
Token ID11870
Feature activation-8.62e-02

Pos logit contributions
 Their+0.090
 Some+0.086
arl+0.075
 lesson+0.071
ily+0.071
Neg logit contributions
eling-0.183
ited-0.175
!-0.167
 unw-0.150
 transported-0.145
Token shouted
Token ID17293
Feature activation-2.93e-02

Pos logit contributions
 Their+0.140
 Some+0.130
 Everyone+0.129
ily+0.128
itted+0.125
Neg logit contributions
eling-0.145
ited-0.131
ivering-0.105
ries-0.104
!-0.103
Token with
Token ID351
Feature activation+4.42e-03

Pos logit contributions
 Their+0.054
 Some+0.052
ily+0.049
arl+0.049
 Everyone+0.048
Neg logit contributions
eling-0.045
ited-0.041
!-0.035
 unw-0.031
ivering-0.030
Token joy
Token ID8716
Feature activation+9.97e-03

Pos logit contributions
eling+0.008
ited+0.007
!+0.006
 unw+0.006
 transported+0.005
Neg logit contributions
 Their-0.007
 Some-0.007
arl-0.006
ily-0.006
itted-0.006
Token as
Token ID355
Feature activation+1.05e-03

Pos logit contributions
eling+0.013
ited+0.012
!+0.009
 unw+0.008
 transported+0.008
Neg logit contributions
 Their-0.020
 Some-0.019
arl-0.019
ily-0.019
itted-0.019
Token\n
Token ID198
Feature activation-1.49e-02

Pos logit contributions
arl+0.014
oys+0.013
utes+0.013
 lesson+0.012
itted+0.012
Neg logit contributions
!-0.018
eling-0.017
 unw-0.016
ited-0.016
!"-0.015
Token\n
Token ID198
Feature activation-1.41e-02

Pos logit contributions
arl+0.022
itted+0.022
 Their+0.021
 Some+0.021
oys+0.020
Neg logit contributions
!-0.024
eling-0.023
ited-0.022
!"-0.020
 unw-0.020
TokenA
Token ID32
Feature activation-3.24e-02

Pos logit contributions
 Their+0.019
arl+0.019
 Some+0.019
itted+0.018
 lesson+0.017
Neg logit contributions
eling-0.025
ited-0.024
!-0.024
 unw-0.021
ivering-0.020
Token 3
Token ID513
Feature activation-5.82e-02

Pos logit contributions
itted+0.063
 Their+0.062
 Some+0.060
arl+0.057
ests+0.056
Neg logit contributions
!-0.041
 bit-0.039
ited-0.038
eling-0.037
 mess-0.032
Token-
Token ID12
Feature activation-7.80e-02

Pos logit contributions
 Their+0.116
 Some+0.112
arl+0.108
itted+0.105
ily+0.104
Neg logit contributions
eling-0.079
ited-0.072
!-0.065
 unw-0.052
ivering-0.050
Tokenyear
Token ID1941
Feature activation-1.84e-01

Pos logit contributions
arl+0.087
 Their+0.086
utes+0.076
robe+0.073
oys+0.072
Neg logit contributions
ited-0.158
eling-0.156
!-0.141
bug-0.137
 seen-0.133
Token-
Token ID12
Feature activation-7.76e-02

Pos logit contributions
 Their+0.404
 Some+0.394
tr+0.373
arl+0.372
utes+0.362
Neg logit contributions
eling-0.184
!-0.175
ited-0.172
!"-0.122
 breath-0.115
Tokenold
Token ID727
Feature activation-7.10e-03

Pos logit contributions
arl+0.162
 Their+0.156
robe+0.146
ula+0.143
tr+0.143
Neg logit contributions
eling-0.089
ited-0.088
!-0.077
 seen-0.062
 years-0.059
Token toddler
Token ID30773
Feature activation+1.80e-02

Pos logit contributions
 Their+0.010
arl+0.010
 Some+0.010
itted+0.009
utes+0.009
Neg logit contributions
eling-0.013
ited-0.012
!-0.012
 unw-0.010
 bit-0.010
Token walked
Token ID6807
Feature activation-1.60e-02

Pos logit contributions
eling+0.037
ited+0.034
ivering+0.027
!+0.026
ge+0.026
Neg logit contributions
 Their-0.026
 Some-0.024
ily-0.024
year-0.022
 Everyone-0.021
Token up
Token ID510
Feature activation-4.36e-03

Pos logit contributions
 Their+0.033
 Some+0.032
arl+0.031
ily+0.030
itted+0.030
Neg logit contributions
eling-0.019
!-0.019
ited-0.018
 unw-0.014
!"-0.013
Token\n
Token ID198
Feature activation-1.19e-02

Pos logit contributions
itted+0.020
year+0.019
 Some+0.019
arl+0.019
 Their+0.019
Neg logit contributions
!-0.021
eling-0.021
!"-0.018
ited-0.017
 unw-0.016
TokenJen
Token ID44875
Feature activation-4.94e-03

Pos logit contributions
 Some+0.015
itted+0.015
 Their+0.015
arl+0.015
year+0.014
Neg logit contributions
eling-0.023
!-0.022
ited-0.021
!"-0.018
 unw-0.018
Token feels
Token ID5300
Feature activation-1.52e-02

Pos logit contributions
 Their+0.008
 Some+0.007
ily+0.007
 lesson+0.007
 Everyone+0.007
Neg logit contributions
eling-0.009
ited-0.008
!-0.007
ge-0.007
ivering-0.006
Token even
Token ID772
Feature activation+1.27e-02

Pos logit contributions
 Their+0.024
 Some+0.023
ily+0.022
 Everyone+0.021
 lesson+0.020
Neg logit contributions
eling-0.027
ited-0.026
!-0.023
 unw-0.021
ge-0.021
Token sad
Token ID6507
Feature activation-1.32e-02

Pos logit contributions
eling+0.021
ited+0.019
!+0.018
 unw+0.016
ivering+0.015
Neg logit contributions
 Their-0.021
 Some-0.020
ily-0.019
year-0.018
arl-0.018
Tokender
Token ID1082
Feature activation-1.55e-01

Pos logit contributions
 Their+0.030
 Some+0.029
year+0.028
arl+0.028
itted+0.028
Neg logit contributions
eling-0.014
!-0.012
ited-0.011
 unw-0.008
!"-0.008
Token.
Token ID13
Feature activation+5.97e-03

Pos logit contributions
 Their+0.358
 Some+0.350
arl+0.335
itted+0.329
 Everyone+0.327
Neg logit contributions
eling-0.164
ited-0.138
!-0.132
 unw-0.087
ivering-0.086
Token She
Token ID1375
Feature activation+1.11e-02

Pos logit contributions
!+0.009
eling+0.008
 unw+0.007
!"+0.007
ited+0.007
Neg logit contributions
year-0.010
 lesson-0.009
aid-0.009
utes-0.009
arl-0.008
Token thinks
Token ID6834
Feature activation-2.75e-02

Pos logit contributions
eling+0.020
ited+0.019
!+0.017
ge+0.016
ivering+0.015
Neg logit contributions
 Their-0.017
 Some-0.016
ily-0.015
 Everyone-0.015
 lesson-0.015
Token Tim
Token ID5045
Feature activation+3.81e-02

Pos logit contributions
 Their+0.046
arl+0.046
itted+0.046
 Some+0.045
tr+0.045
Neg logit contributions
eling-0.041
!-0.039
ited-0.032
 unw-0.031
ries-0.030
Token is
Token ID318
Feature activation+1.98e-02

Pos logit contributions
eling+0.070
ited+0.066
 transported+0.062
ge+0.060
ivering+0.059
Neg logit contributions
 Their-0.051
ily-0.050
 lesson-0.047
 Some-0.046
 They-0.043
TokenJohn
Token ID7554
Feature activation+2.14e-02

Pos logit contributions
 Their+0.016
 Some+0.015
ily+0.014
year+0.013
 Everyone+0.013
Neg logit contributions
eling-0.024
ited-0.022
!-0.020
 unw-0.019
 transported-0.018
Token said
Token ID531
Feature activation-4.16e-03

Pos logit contributions
eling+0.033
ited+0.030
!+0.025
ivering+0.023
 transported+0.022
Neg logit contributions
 Their-0.039
 Some-0.037
ily-0.035
itted-0.035
year-0.034
Token,
Token ID11
Feature activation-3.77e-03

Pos logit contributions
 Their+0.009
arl+0.009
 Some+0.009
tr+0.009
oys+0.009
Neg logit contributions
!-0.004
eling-0.004
ited-0.003
!"-0.003
 transported-0.003
Token "
Token ID366
Feature activation-2.28e-03

Pos logit contributions
 Some+0.005
ests+0.005
arl+0.005
 Their+0.005
ily+0.005
Neg logit contributions
!-0.007
eling-0.006
 unw-0.005
!"-0.005
ited-0.005
TokenCare
Token ID17784
Feature activation+1.07e-02

Pos logit contributions
 Their+0.003
arl+0.003
 lesson+0.003
ometer+0.003
 Some+0.003
Neg logit contributions
ited-0.004
eling-0.004
!-0.004
 unw-0.003
 bit-0.003
Tokenful
Token ID913
Feature activation-1.29e-01

Pos logit contributions
eling+0.024
ited+0.022
!+0.022
 unw+0.019
!"+0.019
Neg logit contributions
 Their-0.012
 Some-0.011
ily-0.010
 lesson-0.010
itted-0.010
Token,
Token ID11
Feature activation+6.09e-04

Pos logit contributions
 Their+0.319
 Some+0.310
ily+0.295
arl+0.295
year+0.293
Neg logit contributions
eling-0.113
ited-0.093
!-0.088
 unw-0.056
!"-0.052
Token Mom
Token ID11254
Feature activation+1.73e-02

Pos logit contributions
eling+0.001
ited+0.001
!+0.001
 unw+0.001
 transported+0.001
Neg logit contributions
 Their-0.001
 Some-0.001
arl-0.001
ily-0.001
itted-0.001
Tokenmy
Token ID1820
Feature activation-1.37e-02

Pos logit contributions
eling+0.029
ited+0.027
 transported+0.021
 unw+0.021
!+0.021
Neg logit contributions
 Their-0.028
 Some-0.028
oys-0.026
arl-0.026
ily-0.026
Token!"
Token ID2474
Feature activation-2.09e-02

Pos logit contributions
oys+0.029
 Their+0.029
ily+0.028
arl+0.028
 Some+0.027
Neg logit contributions
eling-0.016
ited-0.014
 transported-0.009
 unw-0.008
ivering-0.008
Token\n
Token ID198
Feature activation-9.14e-03

Pos logit contributions
arl+0.028
 Their+0.028
ily+0.028
 Some+0.027
utes+0.027
Neg logit contributions
eling-0.037
!-0.036
ited-0.034
 unw-0.030
!"-0.029
Token They
Token ID1119
Feature activation-2.26e-02

Pos logit contributions
year+0.018
avy+0.017
 lesson+0.016
ative+0.016
ictionary+0.015
Neg logit contributions
!-0.017
 unw-0.016
!"-0.014
.-0.014
 or-0.013
Token said
Token ID531
Feature activation-8.35e-03

Pos logit contributions
 Their+0.043
 Some+0.042
ily+0.039
arl+0.039
itted+0.038
Neg logit contributions
eling-0.033
ited-0.029
!-0.026
 unw-0.022
ivering-0.022
Token,
Token ID11
Feature activation-1.27e-02

Pos logit contributions
tr+0.018
arl+0.018
itted+0.017
year+0.017
oys+0.017
Neg logit contributions
!-0.008
eling-0.007
!"-0.006
 unw-0.006
,-0.006
Token "
Token ID366
Feature activation-4.64e-03

Pos logit contributions
ily+0.019
 lesson+0.019
avy+0.019
itted+0.019
ests+0.019
Neg logit contributions
!-0.019
 unw-0.016
eling-0.015
!"-0.015
 smiling-0.014
TokenAr
Token ID3163
Feature activation-4.66e-02

Pos logit contributions
 Their+0.007
arl+0.006
 Some+0.006
year+0.006
 lesson+0.006
Neg logit contributions
eling-0.008
ited-0.008
!-0.007
 unw-0.007
 bit-0.007
Tokenrr
Token ID21062
Feature activation-1.41e-01

Pos logit contributions
 Their+0.035
 Some+0.032
arl+0.031
ily+0.030
oys+0.030
Neg logit contributions
eling-0.118
ited-0.113
!-0.104
 unw-0.098
 transported-0.097
Token,
Token ID11
Feature activation-7.25e-03

Pos logit contributions
 Their+0.257
oys+0.252
arl+0.249
 Some+0.247
ily+0.239
Neg logit contributions
eling-0.215
ited-0.193
 unw-0.145
 transported-0.137
ivering-0.134
Token mate
Token ID16133
Feature activation-6.41e-02

Pos logit contributions
itted+0.014
 Their+0.014
 Some+0.013
year+0.013
arl+0.013
Neg logit contributions
!-0.009
eling-0.008
 unw-0.008
!"-0.008
 bit-0.007
Tokeny
Token ID88
Feature activation-2.52e-02

Pos logit contributions
 Their+0.099
 Some+0.091
oys+0.088
ily+0.088
arl+0.088
Neg logit contributions
eling-0.120
ited-0.112
 unw-0.093
 transported-0.089
!-0.087
Token,
Token ID11
Feature activation-5.60e-03

Pos logit contributions
oys+0.051
 Their+0.051
 Everyone+0.049
ily+0.048
 Some+0.048
Neg logit contributions
eling-0.034
ited-0.031
ivering-0.021
 transported-0.019
 unw-0.019
Token we
Token ID356
Feature activation+5.51e-03

Pos logit contributions
 Their+0.012
itted+0.011
 Some+0.011
year+0.011
arl+0.011
Neg logit contributions
eling-0.006
!-0.006
 unw-0.005
!"-0.005
ited-0.005
Token so
Token ID523
Feature activation-2.55e-02

Pos logit contributions
eling+0.026
ited+0.024
!+0.021
 unw+0.019
ge+0.018
Neg logit contributions
 Their-0.026
 Some-0.025
ily-0.023
year-0.023
itted-0.023
Token happy
Token ID3772
Feature activation-1.70e-03

Pos logit contributions
 Their+0.047
 Some+0.047
oys+0.045
arl+0.044
utes+0.043
Neg logit contributions
eling-0.037
ited-0.034
!-0.031
 unw-0.026
ivering-0.025
Token that
Token ID326
Feature activation-3.15e-02

Pos logit contributions
 Their+0.003
 Some+0.002
ily+0.002
 Everyone+0.002
oys+0.002
Neg logit contributions
eling-0.003
ited-0.003
 unw-0.002
ivering-0.002
 transported-0.002
Token Ph
Token ID1380
Feature activation+7.17e-02

Pos logit contributions
 Their+0.049
 Some+0.047
ily+0.044
arl+0.044
itted+0.043
Neg logit contributions
eling-0.056
ited-0.052
!-0.047
 unw-0.042
 transported-0.041
Tokenoe
Token ID2577
Feature activation+2.58e-02

Pos logit contributions
eling+0.104
ited+0.094
!+0.083
 unw+0.073
 transported+0.069
Neg logit contributions
 Their-0.135
 Some-0.130
arl-0.126
ily-0.125
itted-0.123
Tokenbe
Token ID1350
Feature activation-1.20e-01

Pos logit contributions
eling+0.049
ited+0.045
!+0.041
 unw+0.037
 transported+0.036
Neg logit contributions
 Their-0.037
 Some-0.035
ily-0.033
itted-0.032
year-0.032
Token was
Token ID373
Feature activation+1.38e-02

Pos logit contributions
 Their+0.213
 Some+0.207
arl+0.199
ily+0.190
itted+0.188
Neg logit contributions
eling-0.192
ited-0.174
!-0.160
 unw-0.132
ivering-0.129
Token so
Token ID523
Feature activation-2.72e-02

Pos logit contributions
eling+0.022
ited+0.020
!+0.018
 unw+0.016
 transported+0.015
Neg logit contributions
 Their-0.024
 Some-0.024
arl-0.022
ily-0.022
itted-0.021
Token clever
Token ID14169
Feature activation-5.97e-03

Pos logit contributions
 Their+0.048
 Some+0.048
oys+0.046
tr+0.046
utes+0.045
Neg logit contributions
eling-0.040
ited-0.036
!-0.035
 unw-0.029
 transported-0.028
Token and
Token ID290
Feature activation-6.79e-03

Pos logit contributions
 Their+0.011
 Some+0.011
ily+0.011
arl+0.011
 Everyone+0.011
Neg logit contributions
eling-0.009
ited-0.008
 unw-0.006
!-0.006
ivering-0.006
Token forgiving
Token ID43486
Feature activation-1.81e-02

Pos logit contributions
 Some+0.012
 Their+0.012
arl+0.012
utes+0.011
ily+0.011
Neg logit contributions
eling-0.010
!-0.009
ited-0.009
 unw-0.008
 bit-0.007