TOP ACTIVATIONS
MAX = 10.923

much fun ! The princess couldn 't stop exploring the enchant
ever before !" The rabbit couldn 't believe it . He
rubber was huge ! He couldn 't wait to try it
The little three year old couldn 't help but notice a
a grape . But he couldn 't reach , so his
’ The crab couldn ’ t believe
was so excited ! He couldn 't wait to show his
her new rug that she couldn 't stop smiling . She
. The girl couldn 't believe it ! She
the music man said he couldn 't stay . Everyone was
slide ! Y ol anda couldn 't believe it . She
her room . But she couldn 't find it because her
M agg ie couldn 't help but smile .
the bed - but he couldn 't find it .
Benny was very restless and couldn 't sit still . He
my was sad because he couldn 't drink his juice .
. Kat y couldn ’ t wait
! Even the sour fruits couldn 't ruin their fun !
. She was deaf and couldn 't hear very well .
and again , but it couldn 't make the nest stay

MIN ACTIVATIONS
MIN = -14.178

that day on , Cr ust y tried his best to
. One day , Cr ust y found a shiny shell
Cr ust y . Cr ust y lived in a big
answer for him . Fr ust rated , he decided to
back . Fr ust rated , Jane went back
stick . Fr ust rated , the lion went
or too hard . Fr ust rated , she kept searching
was too dangerous . Fr ust rated , Sam turned away
a little crab named Cr ust y . Cr ust y
but couldn 't . Fr ust rated , he started to
around the door . Fr ust rated , Jack put the
ge . Fr ust rated , the little ant
print so badly . Fr ust rated tears were well ing
was too strong . Fr ust rated , he started to
he said no . Fr ust rated , Bob decided to
. Can you say five kil os ?" uncle Tom said
" They weigh five kil os . That means five
. " Five kil os ," Anna repeated .
the bald monkey weighed 4 kil os . The monkey was
, being surrounded by the tranqu ility of nature .

INTERVAL 4.648 - 10.923
CONTAINS 4.741%

, we are happy as can be ." Then
saw a big dog and wanted to pet it , but
opened the box . They found many toys inside . There
exploring , when he suddenly heard a loud noise . It
day ever ! <|endoftext|> Once upon a time , there was

INTERVAL -1.628 - 4.648
CONTAINS 66.379%

, showing off her dress . Then she went to show
weak . Even the straw was fragile and the stable began
, Tom , I have a new pen . It is
ily had so much fun jumping and rolling around on the
One day , Lily 's dad brought home a surprise for

INTERVAL -7.903 - -1.628
CONTAINS 27.042%

explore the world around him . So he decided to march
L ily looked at her mom and then at her
’ t decide where to go so he asked
made a feast for the animals !" S ara
thin dog looked sad . The fat cat wanted

INTERVAL -14.178 - -7.903
CONTAINS 1.837%

, Lily ." And they all went home happy . <|endoftext|>
mom . I love you too . You are right .
It was filled with ch ocol ates and coins . She
things and keep them for too long . I should respect
and slow . They bounce with joy and laugh .
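
The four INTERVAL ranges above appear to be equal-width bins spanning MIN to MAX. This is an inference from the reported numbers, not something the dump states. Under that assumption, the boundaries can be reproduced with the minimal sketch below; the function name `equal_width_bins` and the default of four bins are hypothetical choices made for illustration.

```python
def equal_width_bins(lo, hi, n_bins=4):
    """Split [lo, hi] into n_bins equal-width bins, highest bin first.

    Assumption: the dashboard bins activations by equal width between the
    reported MIN and MAX. The source does not confirm this.
    """
    width = (hi - lo) / n_bins
    edges = [lo + i * width for i in range(n_bins + 1)]
    # Pair adjacent edges and list the highest interval first, as in the dump.
    return [(edges[i], edges[i + 1]) for i in reversed(range(n_bins))]


# MIN and MAX as reported above (already rounded to three decimals).
bins = equal_width_bins(-14.178, 10.923)
for lower, upper in bins:
    print(f"INTERVAL {lower:.3f} - {upper:.3f}")
```

Because MIN and MAX are themselves rounded, the computed edges should match the reported ones only to within about 0.001. The reported CONTAINS percentages (4.741, 66.379, 27.042, 1.837) sum to 99.999, which is consistent with the four bins covering every activation.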
Token much
Token ID881
Feature activation-6.33e+00

Pos logit contributions
ust+0.032
irk+0.029
oker+0.028
 mind+0.027
 forward+0.027
Neg logit contributions
lf-0.008
 parked-0.005
 airplane-0.005
 Year-0.004
 set-0.004
Token fun
Token ID1257
Feature activation-3.27e+00

Pos logit contributions
ust+0.199
irk+0.168
apping+0.153
 forward+0.153
oker+0.153
Neg logit contributions
lf-0.169
 then-0.150
 airplane-0.148
 cardboard-0.148
 set-0.145
Token!
Token ID0
Feature activation-4.68e-01

Pos logit contributions
ust+0.113
irk+0.096
 forward+0.087
olding+0.086
umble+0.085
Neg logit contributions
lf-0.089
 airplane-0.071
 set-0.068
 then-0.068
 Year-0.067
Token The
Token ID383
Feature activation+6.59e+00

Pos logit contributions
ust+0.014
irk+0.012
oker+0.011
 forward+0.011
 mind+0.011
Neg logit contributions
lf-0.013
 airplane-0.011
 parked-0.011
 Year-0.011
 cardboard-0.011
Token princess
Token ID21752
Feature activation-8.53e-01

Pos logit contributions
lf+0.159
 parked+0.126
 Year+0.126
 airplane+0.123
 cardboard+0.118
Neg logit contributions
ust-0.212
irk-0.183
 mind-0.180
oker-0.177
 forward-0.175
Token couldn
Token ID3521
Feature activation+9.76e+00

Pos logit contributions
ust+0.032
irk+0.028
oker+0.027
 mind+0.026
 forward+0.026
Neg logit contributions
lf-0.017
 airplane-0.013
 parked-0.013
 Year-0.013
 cardboard-0.013
Token't
Token ID470
Feature activation-2.16e+00

Pos logit contributions
lf+0.162
 parked+0.117
 airplane+0.115
 set+0.110
 then+0.110
Neg logit contributions
ust-0.414
irk-0.374
oker-0.348
 forward-0.342
osa-0.341
Token stop
Token ID2245
Feature activation-5.92e+00

Pos logit contributions
ust+0.063
irk+0.053
oker+0.050
 forward+0.048
apping+0.047
Neg logit contributions
lf-0.064
 airplane-0.054
 then-0.054
 parked-0.053
 set-0.052
Token exploring
Token ID13504
Feature activation-4.44e-01

Pos logit contributions
ust+0.226
oker+0.192
irk+0.188
olding+0.174
 loser+0.174
Neg logit contributions
lf-0.124
 airplane-0.116
 then-0.116
 set-0.108
 walls-0.106
Token the
Token ID262
Feature activation+2.45e+00

Pos logit contributions
ust+0.017
irk+0.015
oker+0.014
aring+0.014
olding+0.014
Neg logit contributions
lf-0.010
 airplane-0.008
 cardboard-0.008
 then-0.008
 walls-0.008
Token enchant
Token ID23260
Feature activation-3.83e+00

Pos logit contributions
lf+0.052
 airplane+0.040
 parked+0.040
 Year+0.039
 cardboard+0.039
Neg logit contributions
ust-0.091
irk-0.078
oker-0.075
 forward-0.074
 mind-0.074
Token ever
Token ID1683
Feature activation-6.04e+00

Pos logit contributions
lf+0.044
 airplane+0.036
 Year+0.035
 parked+0.035
 cardboard+0.034
Neg logit contributions
ust-0.048
irk-0.040
 mind-0.039
oker-0.038
 possible-0.038
Token before
Token ID878
Feature activation-2.10e-01

Pos logit contributions
ust+0.184
irk+0.159
oker+0.150
olding+0.142
istered+0.142
Neg logit contributions
 then-0.153
lf-0.150
 set-0.148
 airplane-0.135
 parked-0.133
Token!"
Token ID2474
Feature activation+4.20e+00

Pos logit contributions
ust+0.007
irk+0.006
oker+0.006
 forward+0.006
olding+0.006
Neg logit contributions
lf-0.005
 then-0.004
 set-0.004
 parked-0.004
 airplane-0.004
Token The
Token ID383
Feature activation+6.62e+00

Pos logit contributions
lf+0.092
 then+0.083
 parked+0.073
 set+0.072
 Year+0.071
Neg logit contributions
ust-0.157
irk-0.130
 possible-0.126
oker-0.126
 unnecessary-0.122
Token rabbit
Token ID22746
Feature activation+6.45e+00

Pos logit contributions
lf+0.162
 parked+0.128
 airplane+0.127
 Year+0.127
 cardboard+0.124
Neg logit contributions
ust-0.219
irk-0.187
oker-0.180
 mind-0.179
 forward-0.177
Token couldn
Token ID3521
Feature activation+9.76e+00

Pos logit contributions
lf+0.125
 then+0.097
 parked+0.095
 airplane+0.094
 set+0.094
Neg logit contributions
ust-0.254
irk-0.221
oker-0.216
 possible-0.214
 forward-0.210
Token't
Token ID470
Feature activation-2.00e+00

Pos logit contributions
lf+0.128
 then+0.089
 parked+0.088
 set+0.085
 store+0.079
Neg logit contributions
ust-0.446
irk-0.408
ashing-0.377
oker-0.376
osa-0.372
Token believe
Token ID1975
Feature activation-4.75e+00

Pos logit contributions
ust+0.060
irk+0.051
oker+0.049
elly+0.046
 forward+0.046
Neg logit contributions
lf-0.057
 then-0.051
 set-0.049
 airplane-0.048
 parked-0.047
Token it
Token ID340
Feature activation-4.80e-01

Pos logit contributions
ust+0.176
irk+0.152
oker+0.147
umpy+0.142
ashes+0.142
Neg logit contributions
lf-0.108
 airplane-0.083
 then-0.081
 parked-0.078
 Year-0.077
Token.
Token ID13
Feature activation-1.01e+00

Pos logit contributions
ust+0.014
irk+0.011
 forward+0.011
 possible+0.010
 mind+0.010
Neg logit contributions
lf-0.014
 Year-0.012
 airplane-0.012
 parked-0.012
 cardboard-0.012
Token He
Token ID679
Feature activation+3.69e+00

Pos logit contributions
ust+0.035
irk+0.029
oker+0.028
 possible+0.028
 forward+0.027
Neg logit contributions
lf-0.024
 parked-0.020
 Year-0.019
 then-0.019
 airplane-0.019
Token rubber
Token ID14239
Feature activation+5.21e+00

Pos logit contributions
lf+0.204
 parked+0.160
 Year+0.157
 airplane+0.155
 cardboard+0.149
Neg logit contributions
ust-0.290
irk-0.249
 mind-0.243
oker-0.240
 forward-0.238
Token was
Token ID373
Feature activation+2.99e+00

Pos logit contributions
lf+0.116
 Year+0.090
 airplane+0.089
 parked+0.088
 cardboard+0.086
Neg logit contributions
ust-0.187
irk-0.161
oker-0.156
 forward-0.154
 mind-0.154
Token huge
Token ID3236
Feature activation+3.69e+00

Pos logit contributions
lf+0.082
 parked+0.070
 airplane+0.069
 then+0.069
 cardboard+0.068
Neg logit contributions
ust-0.094
irk-0.080
oker-0.077
 forward-0.071
umble-0.070
Token!
Token ID0
Feature activation-6.12e-01

Pos logit contributions
lf+0.082
 airplane+0.067
 cardboard+0.065
 parked+0.064
 Year+0.063
Neg logit contributions
ust-0.135
irk-0.113
oker-0.110
 forward-0.106
 possible-0.106
Token He
Token ID679
Feature activation+4.35e+00

Pos logit contributions
ust+0.021
irk+0.017
oker+0.017
 possible+0.016
 forward+0.016
Neg logit contributions
lf-0.015
 parked-0.012
 airplane-0.012
 Year-0.012
 then-0.012
Token couldn
Token ID3521
Feature activation+9.74e+00

Pos logit contributions
lf+0.087
 airplane+0.067
 Year+0.066
 cardboard+0.064
 parked+0.063
Neg logit contributions
ust-0.165
irk-0.141
oker-0.137
 mind-0.137
 forward-0.135
Token't
Token ID470
Feature activation-2.04e+00

Pos logit contributions
lf+0.152
 parked+0.113
 then+0.107
 set+0.104
 airplane+0.104
Neg logit contributions
ust-0.423
irk-0.382
oker-0.352
osa-0.349
ashing-0.348
Token wait
Token ID4043
Feature activation-4.16e+00

Pos logit contributions
ust+0.060
irk+0.051
oker+0.048
 forward+0.047
aring+0.046
Neg logit contributions
lf-0.059
 then-0.052
 airplane-0.051
 set-0.050
 parked-0.050
Token to
Token ID284
Feature activation+1.63e+00

Pos logit contributions
ust+0.122
irk+0.102
oker+0.097
umpy+0.091
 loser+0.090
Neg logit contributions
lf-0.120
 then-0.105
 airplane-0.105
 set-0.103
 cardboard-0.102
Token try
Token ID1949
Feature activation+2.09e+00

Pos logit contributions
lf+0.041
 airplane+0.033
 parked+0.033
 cardboard+0.033
 set+0.032
Neg logit contributions
ust-0.056
irk-0.047
oker-0.045
 forward-0.044
olding-0.043
Token it
Token ID340
Feature activation-7.37e-01

Pos logit contributions
lf+0.038
 airplane+0.029
 then+0.028
 cardboard+0.027
 set+0.026
Neg logit contributions
ust-0.088
irk-0.078
oker-0.077
olding-0.073
umpy-0.073
TokenThe
Token ID464
Feature activation+8.56e+00

Pos logit contributions
lf+0.082
 airplane+0.065
 parked+0.064
 Year+0.064
 cardboard+0.063
Neg logit contributions
ust-0.119
irk-0.101
oker-0.097
 forward-0.095
 possible-0.095
Token little
Token ID1310
Feature activation+6.18e+00

Pos logit contributions
lf+0.204
 parked+0.162
 Year+0.158
 airplane+0.158
 cardboard+0.154
Neg logit contributions
ust-0.287
irk-0.243
 mind-0.237
oker-0.236
 forward-0.234
Token three
Token ID1115
Feature activation+1.98e+00

Pos logit contributions
lf+0.151
 airplane+0.127
 cardboard+0.121
 then+0.121
 Year+0.120
Neg logit contributions
ust-0.215
irk-0.177
oker-0.171
 possible-0.167
 forward-0.166
Token year
Token ID614
Feature activation-2.07e+00

Pos logit contributions
lf+0.051
 airplane+0.039
 then+0.039
 Year+0.038
 walls+0.038
Neg logit contributions
ust-0.068
irk-0.057
 possible-0.057
oker-0.055
 forward-0.054
Token old
Token ID1468
Feature activation+5.39e+00

Pos logit contributions
ust+0.071
irk+0.061
oker+0.058
 mind+0.057
 forward+0.056
Neg logit contributions
lf-0.050
 airplane-0.041
 parked-0.040
 then-0.040
 Year-0.040
Token couldn
Token ID3521
Feature activation+9.72e+00

Pos logit contributions
lf+0.127
 airplane+0.101
 parked+0.100
 Year+0.099
 cardboard+0.098
Neg logit contributions
ust-0.185
irk-0.158
oker-0.152
 mind-0.150
 forward-0.149
Token't
Token ID470
Feature activation-2.15e+00

Pos logit contributions
lf+0.189
 airplane+0.145
 parked+0.145
 cardboard+0.139
 set+0.136
Neg logit contributions
ust-0.384
irk-0.342
oker-0.320
 forward-0.313
osa-0.310
Token help
Token ID1037
Feature activation-9.30e-01

Pos logit contributions
ust+0.060
irk+0.051
oker+0.048
 forward+0.045
osa+0.045
Neg logit contributions
lf-0.065
 airplane-0.056
 then-0.056
 parked-0.055
 set-0.055
Token but
Token ID475
Feature activation+8.81e-01

Pos logit contributions
ust+0.033
irk+0.030
oker+0.029
ashes+0.028
umpy+0.028
Neg logit contributions
lf-0.023
 then-0.019
 cardboard-0.018
 airplane-0.018
 walls-0.018
Token notice
Token ID4003
Feature activation+1.01e+00

Pos logit contributions
lf+0.020
 then+0.020
 set+0.018
 airplane+0.017
 parked+0.017
Neg logit contributions
ust-0.031
irk-0.028
umpy-0.026
ashes-0.026
oker-0.026
Token a
Token ID257
Feature activation+3.25e+00

Pos logit contributions
lf+0.021
 airplane+0.017
 parked+0.016
 cardboard+0.016
 then+0.016
Neg logit contributions
ust-0.038
irk-0.033
oker-0.032
 forward-0.031
 mind-0.031
Token a
Token ID257
Feature activation+4.35e+00

Pos logit contributions
ust+0.192
 mind+0.161
irk+0.161
 forward+0.160
oker+0.158
Neg logit contributions
lf-0.129
 airplane-0.104
 Year-0.103
 cardboard-0.097
 walls-0.097
Token grape
Token ID30777
Feature activation+1.73e+00

Pos logit contributions
lf+0.123
 airplane+0.102
 parked+0.101
 Year+0.100
 cardboard+0.100
Neg logit contributions
ust-0.131
irk-0.108
oker-0.103
 forward-0.101
 mind-0.101
Token.
Token ID13
Feature activation-4.58e-02

Pos logit contributions
lf+0.045
 airplane+0.037
 parked+0.036
 then+0.036
 cardboard+0.035
Neg logit contributions
ust-0.057
irk-0.048
oker-0.045
 forward-0.045
 mind-0.045
Token But
Token ID887
Feature activation+8.61e-01

Pos logit contributions
ust+0.001
irk+0.001
oker+0.001
 forward+0.001
 possible+0.001
Neg logit contributions
lf-0.001
 airplane-0.001
 parked-0.001
 Year-0.001
 cardboard-0.001
Token he
Token ID339
Feature activation-3.42e-02

Pos logit contributions
lf+0.019
 airplane+0.015
 parked+0.015
 then+0.015
 cardboard+0.014
Neg logit contributions
ust-0.032
irk-0.027
oker-0.026
 forward-0.025
 mind-0.025
Token couldn
Token ID3521
Feature activation+9.71e+00

Pos logit contributions
ust+0.001
irk+0.001
oker+0.001
 forward+0.001
 possible+0.001
Neg logit contributions
lf-0.001
 parked-0.000
 airplane-0.000
 then-0.000
 cardboard-0.000
Token't
Token ID470
Feature activation-2.28e+00

Pos logit contributions
lf+0.161
 parked+0.117
 then+0.111
 airplane+0.110
 cardboard+0.108
Neg logit contributions
ust-0.414
irk-0.373
oker-0.344
ashing-0.340
 forward-0.340
Token reach
Token ID3151
Feature activation-1.62e+00

Pos logit contributions
ust+0.082
irk+0.073
oker+0.069
 forward+0.068
 possible+0.066
Neg logit contributions
lf-0.051
 then-0.044
 set-0.043
 store-0.041
 parked-0.040
Token,
Token ID11
Feature activation+1.62e+00

Pos logit contributions
ust+0.076
irk+0.067
oker+0.065
aring+0.064
apping+0.064
Neg logit contributions
lf-0.020
 then-0.018
 airplane-0.015
 set-0.014
 cardboard-0.014
Token so
Token ID523
Feature activation-5.64e-01

Pos logit contributions
lf+0.039
 then+0.033
 parked+0.032
 airplane+0.032
 cardboard+0.030
Neg logit contributions
ust-0.057
irk-0.049
oker-0.047
olding-0.044
 possible-0.044
Token his
Token ID465
Feature activation-2.67e+00

Pos logit contributions
ust+0.022
irk+0.020
oker+0.018
ashes+0.018
aring+0.018
Neg logit contributions
lf-0.012
 then-0.010
 airplane-0.009
 parked-0.009
 cardboard-0.009
Tokenâ
Token ID22940
Feature activation-8.08e-01

Pos logit contributions
ust+0.002
irk+0.002
oker+0.002
 possible+0.002
 mind+0.002
Neg logit contributions
lf-0.002
 parked-0.002
 Year-0.002
 airplane-0.002
 then-0.002
Token
Token ID26391
Feature activation+1.83e+00

Pos logit contributions
ust+0.024
irk+0.019
oker+0.018
 forward+0.018
 mind+0.018
Neg logit contributions
lf-0.024
 airplane-0.020
 cardboard-0.020
 Year-0.019
 walls-0.019
Token
Token ID8151
Feature activation-1.87e+00

Pos logit contributions
lf+0.046
 parked+0.036
 Year+0.036
 airplane+0.036
 then+0.035
Neg logit contributions
ust-0.062
irk-0.053
oker-0.050
 forward-0.049
 possible-0.049
Token The
Token ID383
Feature activation+6.33e+00

Pos logit contributions
ust+0.051
irk+0.042
oker+0.040
 possible+0.039
 mind+0.039
Neg logit contributions
lf-0.057
 parked-0.049
 airplane-0.047
 then-0.047
 Year-0.046
Token crab
Token ID32202
Feature activation+2.40e+00

Pos logit contributions
lf+0.155
 airplane+0.125
 parked+0.123
 Year+0.122
 cardboard+0.121
Neg logit contributions
ust-0.214
irk-0.181
oker-0.174
 forward-0.171
 mind-0.170
Token couldn
Token ID3521
Feature activation+9.70e+00

Pos logit contributions
lf+0.041
 parked+0.029
 then+0.029
 airplane+0.028
 set+0.027
Neg logit contributions
ust-0.101
irk-0.089
oker-0.085
 forward-0.085
 possible-0.085
Tokenâ
Token ID22940
Feature activation-8.41e-01

Pos logit contributions
lf+0.151
 parked+0.110
 then+0.105
 airplane+0.102
 set+0.102
Neg logit contributions
ust-0.417
irk-0.377
oker-0.350
 forward-0.344
osa-0.343
Token
Token ID26391
Feature activation+1.32e+00

Pos logit contributions
ust+0.023
irk+0.019
oker+0.018
 forward+0.018
 possible+0.017
Neg logit contributions
lf-0.026
 airplane-0.022
 parked-0.022
 Year-0.021
 cardboard-0.021
Token
Token ID8151
Feature activation-2.18e+00

Pos logit contributions
lf+0.035
 parked+0.028
 Year+0.028
 airplane+0.028
 then+0.027
Neg logit contributions
ust-0.043
irk-0.036
oker-0.034
 forward-0.034
 possible-0.034
Tokent
Token ID83
Feature activation-2.90e+00

Pos logit contributions
ust+0.054
irk+0.047
rys+0.044
oker+0.043
 mind+0.043
Neg logit contributions
lf-0.068
 parked-0.060
 then-0.057
ons-0.055
 set-0.055
Token believe
Token ID1975
Feature activation-5.02e+00

Pos logit contributions
ust+0.084
irk+0.074
osa+0.068
oker+0.068
apping+0.065
Neg logit contributions
lf-0.081
 then-0.073
 parked-0.070
 set-0.070
 airplane-0.069
Token was
Token ID373
Feature activation+3.49e+00

Pos logit contributions
ust+0.044
aring+0.032
 forward+0.031
irk+0.030
Aud+0.029
Neg logit contributions
 island-0.052
 Year-0.052
 airplane-0.052
 fridge-0.052
lf-0.052
Token so
Token ID523
Feature activation+3.50e-01

Pos logit contributions
lf+0.083
 then+0.076
 parked+0.072
 set+0.072
 airplane+0.070
Neg logit contributions
ust-0.119
irk-0.106
oker-0.098
aring-0.094
umble-0.092
Token excited
Token ID6568
Feature activation+4.75e+00

Pos logit contributions
lf+0.009
 then+0.008
 set+0.008
 parked+0.007
 walls+0.007
Neg logit contributions
ust-0.011
irk-0.010
 loser-0.009
oker-0.009
aring-0.009
Token!
Token ID0
Feature activation-4.48e-01

Pos logit contributions
lf+0.104
 then+0.086
 airplane+0.086
 set+0.083
 parked+0.079
Neg logit contributions
ust-0.178
irk-0.156
 loser-0.152
oker-0.150
rys-0.144
Token He
Token ID679
Feature activation+4.16e+00

Pos logit contributions
ust+0.014
irk+0.012
oker+0.011
 possible+0.011
 forward+0.011
Neg logit contributions
lf-0.012
 parked-0.010
 airplane-0.010
 Year-0.010
 cardboard-0.009
Token couldn
Token ID3521
Feature activation+9.66e+00

Pos logit contributions
lf+0.084
 Year+0.066
 airplane+0.065
 cardboard+0.064
 parked+0.061
Neg logit contributions
ust-0.155
irk-0.131
 mind-0.129
 forward-0.127
oker-0.126
Token't
Token ID470
Feature activation-2.10e+00

Pos logit contributions
lf+0.172
 parked+0.131
 airplane+0.126
 then+0.121
 set+0.121
Neg logit contributions
ust-0.398
irk-0.357
oker-0.331
ashing-0.325
 forward-0.324
Token wait
Token ID4043
Feature activation-4.23e+00

Pos logit contributions
ust+0.060
irk+0.050
oker+0.047
elly+0.044
 forward+0.044
Neg logit contributions
lf-0.064
 then-0.056
 airplane-0.055
 parked-0.054
 set-0.054
Token to
Token ID284
Feature activation+1.52e+00

Pos logit contributions
ust+0.126
irk+0.105
oker+0.100
umpy+0.095
 loser+0.093
Neg logit contributions
lf-0.122
 airplane-0.106
 then-0.106
 set-0.104
 cardboard-0.102
Token show
Token ID905
Feature activation-6.53e+00

Pos logit contributions
lf+0.037
 airplane+0.030
 parked+0.030
 set+0.030
 cardboard+0.030
Neg logit contributions
ust-0.054
irk-0.046
oker-0.044
olding-0.042
 forward-0.042
Token his
Token ID465
Feature activation-2.68e+00

Pos logit contributions
ust+0.261
irk+0.234
umpy+0.228
ashing+0.224
elly+0.222
Neg logit contributions
lf-0.137
 airplane-0.116
 then-0.115
 cardboard-0.103
 walls-0.103
Token her
Token ID607
Feature activation-1.36e+00

Pos logit contributions
lf+0.055
 airplane+0.046
 parked+0.044
 then+0.044
 cardboard+0.044
Neg logit contributions
ust-0.066
irk-0.055
oker-0.052
 forward-0.051
 mind-0.050
Token new
Token ID649
Feature activation-1.60e+00

Pos logit contributions
ust+0.043
irk+0.036
oker+0.034
 forward+0.033
 possible+0.033
Neg logit contributions
lf-0.037
 airplane-0.031
 parked-0.030
 cardboard-0.030
 Year-0.030
Token rug
Token ID14477
Feature activation+3.99e+00

Pos logit contributions
ust+0.056
irk+0.047
oker+0.045
 forward+0.044
aring+0.043
Neg logit contributions
lf-0.039
 airplane-0.032
 cardboard-0.031
 Year-0.030
 parked-0.030
Token that
Token ID326
Feature activation+2.75e+00

Pos logit contributions
lf+0.110
 airplane+0.091
 parked+0.090
 Year+0.089
 cardboard+0.089
Neg logit contributions
ust-0.122
irk-0.101
oker-0.097
 forward-0.095
 mind-0.095
Token she
Token ID673
Feature activation-9.18e-01

Pos logit contributions
lf+0.066
 set+0.051
 cardboard+0.050
 airplane+0.050
 then+0.050
Neg logit contributions
ust-0.104
irk-0.094
aring-0.087
olding-0.086
ashes-0.085
Token couldn
Token ID3521
Feature activation+9.65e+00

Pos logit contributions
ust+0.034
irk+0.029
oker+0.028
 mind+0.028
 forward+0.028
Neg logit contributions
lf-0.019
 airplane-0.015
 Year-0.015
 cardboard-0.014
 parked-0.014
Token't
Token ID470
Feature activation-2.37e+00

Pos logit contributions
lf+0.182
 airplane+0.135
 parked+0.134
 cardboard+0.129
 Year+0.128
Neg logit contributions
ust-0.387
irk-0.341
oker-0.324
 forward-0.318
 mind-0.315
Token stop
Token ID2245
Feature activation-6.15e+00

Pos logit contributions
ust+0.078
irk+0.068
oker+0.064
 forward+0.062
osa+0.061
Neg logit contributions
lf-0.062
 airplane-0.051
 then-0.051
 parked-0.051
 set-0.050
Token smiling
Token ID16755
Feature activation-2.12e+00

Pos logit contributions
ust+0.231
irk+0.198
oker+0.195
olding+0.180
aring+0.177
Neg logit contributions
lf-0.137
 airplane-0.119
 then-0.117
 walls-0.111
 Year-0.109
Token.
Token ID13
Feature activation-1.09e+00

Pos logit contributions
ust+0.068
irk+0.061
oker+0.059
rys+0.056
 possible+0.055
Neg logit contributions
lf-0.055
 airplane-0.048
 walls-0.046
 then-0.046
 set-0.046
Token She
Token ID1375
Feature activation+2.56e+00

Pos logit contributions
ust+0.036
irk+0.031
oker+0.030
 possible+0.029
 forward+0.029
Neg logit contributions
lf-0.027
 parked-0.022
 Year-0.022
 airplane-0.021
 then-0.021
Token.
Token ID13
Feature activation-2.50e-01

Pos logit contributions
lf+0.124
 Year+0.103
 cardboard+0.098
 airplane+0.097
 parked+0.094
Neg logit contributions
ust-0.180
irk-0.151
 forward-0.149
 mind-0.148
 possible-0.147
Token
Token ID198
Feature activation-8.41e-01

Pos logit contributions
ust+0.007
irk+0.006
oker+0.006
 forward+0.006
 mind+0.006
Neg logit contributions
lf-0.007
 airplane-0.006
 parked-0.006
 Year-0.006
 cardboard-0.006
Token
Token ID198
Feature activation-1.06e+00

Pos logit contributions
ust+0.026
irk+0.021
oker+0.020
 forward+0.020
 mind+0.019
Neg logit contributions
lf-0.024
 airplane-0.020
 Year-0.019
 parked-0.019
 cardboard-0.019
TokenThe
Token ID464
Feature activation+5.57e+00

Pos logit contributions
ust+0.038
irk+0.032
oker+0.031
 forward+0.030
 mind+0.030
Neg logit contributions
lf-0.025
 airplane-0.020
 parked-0.019
 Year-0.019
 cardboard-0.019
Token girl
Token ID2576
Feature activation+2.27e+00

Pos logit contributions
lf+0.137
ons+0.108
 parked+0.107
 Year+0.104
pped+0.103
Neg logit contributions
ust-0.175
 mind-0.152
 possible-0.147
irk-0.147
 loser-0.147
Token couldn
Token ID3521
Feature activation+9.61e+00

Pos logit contributions
lf+0.048
 cardboard+0.037
 airplane+0.036
 Year+0.036
 fridge+0.034
Neg logit contributions
ust-0.081
irk-0.069
 mind-0.069
 forward-0.066
aring-0.065
Token't
Token ID470
Feature activation-2.46e+00

Pos logit contributions
lf+0.170
 airplane+0.125
 parked+0.124
 Year+0.119
 then+0.118
Neg logit contributions
ust-0.395
irk-0.350
oker-0.332
 mind-0.325
 forward-0.325
Token believe
Token ID1975
Feature activation-4.88e+00

Pos logit contributions
ust+0.068
irk+0.055
oker+0.053
 forward+0.051
 possible+0.050
Neg logit contributions
lf-0.077
 airplane-0.066
 parked-0.064
 cardboard-0.063
 then-0.063
Token it
Token ID340
Feature activation-5.83e-01

Pos logit contributions
ust+0.169
irk+0.145
oker+0.140
 forward+0.135
 mind+0.134
Neg logit contributions
lf-0.120
 airplane-0.095
 parked-0.092
 Year-0.091
 cardboard-0.091
Token!
Token ID0
Feature activation-9.18e-01

Pos logit contributions
ust+0.015
 forward+0.011
irk+0.011
 possible+0.011
 mind+0.010
Neg logit contributions
lf-0.020
 Year-0.017
 airplane-0.017
 parked-0.017
 cardboard-0.016
Token She
Token ID1375
Feature activation+2.62e+00

Pos logit contributions
ust+0.028
irk+0.023
oker+0.022
 forward+0.021
 mind+0.021
Neg logit contributions
lf-0.026
 airplane-0.021
 parked-0.021
 Year-0.021
 cardboard-0.021
Token the
Token ID262
Feature activation+3.29e+00

Pos logit contributions
lf+0.032
 then+0.026
 airplane+0.026
 parked+0.025
 Year+0.024
Neg logit contributions
ust-0.048
irk-0.042
oker-0.039
umpy-0.038
 forward-0.038
Token music
Token ID2647
Feature activation+7.43e-01

Pos logit contributions
lf+0.078
 airplane+0.062
 parked+0.061
 Year+0.061
 cardboard+0.060
Neg logit contributions
ust-0.114
irk-0.096
oker-0.093
 forward-0.091
 mind-0.091
Token man
Token ID582
Feature activation+2.70e+00

Pos logit contributions
lf+0.012
 airplane+0.008
 set+0.008
 then+0.008
 parked+0.007
Neg logit contributions
ust-0.034
irk-0.030
 possible-0.028
oker-0.028
 forward-0.028
Token said
Token ID531
Feature activation+2.95e+00

Pos logit contributions
lf+0.052
 parked+0.041
 then+0.040
 airplane+0.039
 set+0.038
Neg logit contributions
ust-0.107
irk-0.094
oker-0.091
 possible-0.090
 forward-0.088
Token he
Token ID339
Feature activation-5.62e-01

Pos logit contributions
lf+0.061
 then+0.050
 airplane+0.046
 parked+0.045
 set+0.044
Neg logit contributions
ust-0.115
irk-0.100
oker-0.096
ashes-0.093
 possible-0.092
Token couldn
Token ID3521
Feature activation+9.59e+00

Pos logit contributions
ust+0.023
irk+0.020
oker+0.019
 forward+0.019
 mind+0.019
Neg logit contributions
lf-0.010
 airplane-0.007
 Year-0.007
 cardboard-0.007
 parked-0.007
Token't
Token ID470
Feature activation-2.41e+00

Pos logit contributions
lf+0.146
 parked+0.103
 then+0.099
 airplane+0.097
 set+0.094
Neg logit contributions
ust-0.427
irk-0.387
ashing-0.357
oker-0.356
osa-0.352
Token stay
Token ID2652
Feature activation-3.35e+00

Pos logit contributions
ust+0.085
irk+0.077
osa+0.073
 unnecessary+0.072
oker+0.072
Neg logit contributions
lf-0.054
 then-0.051
 set-0.050
 airplane-0.046
 parked-0.046
Token.
Token ID13
Feature activation-8.44e-01

Pos logit contributions
ust+0.115
irk+0.099
oker+0.099
umble+0.094
rav+0.094
Neg logit contributions
lf-0.079
 then-0.078
 airplane-0.076
 set-0.076
 parked-0.076
Token Everyone
Token ID11075
Feature activation-5.45e+00

Pos logit contributions
ust+0.031
irk+0.026
 possible+0.026
oker+0.025
 forward+0.025
Neg logit contributions
lf-0.018
 then-0.015
 parked-0.015
 Year-0.014
 set-0.014
Token was
Token ID373
Feature activation+2.10e+00

Pos logit contributions
ust+0.242
 possible+0.219
irk+0.217
 fault+0.214
 loser+0.213
Neg logit contributions
lf-0.079
 then-0.068
 parked-0.058
 set-0.057
 airplane-0.049
Token slide
Token ID10649
Feature activation+1.79e+00

Pos logit contributions
lf+0.089
 airplane+0.070
 parked+0.068
 Year+0.068
 cardboard+0.067
Neg logit contributions
ust-0.149
irk-0.127
oker-0.123
 forward-0.120
 mind-0.120
Token!
Token ID0
Feature activation-4.10e-01

Pos logit contributions
lf+0.046
 airplane+0.038
 parked+0.038
 Year+0.037
 cardboard+0.037
Neg logit contributions
ust-0.058
irk-0.049
oker-0.047
 mind-0.045
 forward-0.045
Token Y
Token ID575
Feature activation-4.36e+00

Pos logit contributions
ust+0.013
irk+0.011
oker+0.010
 possible+0.010
 mind+0.010
Neg logit contributions
lf-0.011
 parked-0.009
 airplane-0.009
 Year-0.009
 then-0.009
Tokenol
Token ID349
Feature activation-3.25e+00

Pos logit contributions
ust+0.143
irk+0.120
oker+0.116
 forward+0.113
 mind+0.113
Neg logit contributions
lf-0.113
 airplane-0.092
 parked-0.090
 Year-0.090
 cardboard-0.089
Tokenanda
Token ID5282
Feature activation+1.42e+00

Pos logit contributions
ust+0.114
irk+0.097
 forward+0.092
 possible+0.091
oker+0.090
Neg logit contributions
lf-0.077
 then-0.063
 airplane-0.061
 parked-0.060
ons-0.059
Token couldn
Token ID3521
Feature activation+9.58e+00

Pos logit contributions
lf+0.031
 airplane+0.025
 parked+0.024
 then+0.024
 Year+0.024
Neg logit contributions
ust-0.052
irk-0.045
oker-0.043
 forward-0.042
 possible-0.042
Token't
Token ID470
Feature activation-2.49e+00

Pos logit contributions
lf+0.162
 parked+0.119
 airplane+0.116
 then+0.112
 set+0.111
Neg logit contributions
ust-0.399
irk-0.362
oker-0.336
 forward-0.333
 mind-0.329
Token believe
Token ID1975
Feature activation-4.91e+00

Pos logit contributions
ust+0.081
irk+0.074
osa+0.068
elly+0.066
oker+0.066
Neg logit contributions
lf-0.064
 then-0.057
 set-0.055
 parked-0.054
 airplane-0.053
Token it
Token ID340
Feature activation-6.22e-01

Pos logit contributions
ust+0.193
irk+0.173
oker+0.165
ashes+0.161
 loser+0.160
Neg logit contributions
lf-0.098
 airplane-0.074
 then-0.070
 Year-0.066
 parked-0.066
Token.
Token ID13
Feature activation-1.03e+00

Pos logit contributions
ust+0.020
irk+0.017
oker+0.016
 mind+0.016
 forward+0.016
Neg logit contributions
lf-0.016
 airplane-0.013
 parked-0.013
 cardboard-0.013
 then-0.013
Token She
Token ID1375
Feature activation+2.44e+00

Pos logit contributions
ust+0.034
irk+0.028
oker+0.027
 possible+0.027
 forward+0.027
Neg logit contributions
lf-0.026
 parked-0.021
 airplane-0.021
 Year-0.021
 cardboard-0.021
Token her
Token ID607
Feature activation-1.36e+00

Pos logit contributions
lf+0.053
 parked+0.048
 airplane+0.047
 then+0.046
 set+0.045
Neg logit contributions
ust-0.073
irk-0.069
oker-0.066
umble-0.063
osa-0.062
Token room
Token ID2119
Feature activation+3.16e+00

Pos logit contributions
ust+0.043
irk+0.036
oker+0.034
 forward+0.033
 possible+0.033
Neg logit contributions
lf-0.037
 airplane-0.031
 parked-0.030
 cardboard-0.030
 Year-0.030
Token.
Token ID13
Feature activation-3.71e-02

Pos logit contributions
lf+0.087
 airplane+0.072
 parked+0.071
 cardboard+0.069
 set+0.069
Neg logit contributions
ust-0.100
irk-0.084
oker-0.079
 forward-0.077
 possible-0.076
Token But
Token ID887
Feature activation+7.61e-01

Pos logit contributions
ust+0.001
irk+0.001
oker+0.001
 forward+0.001
 possible+0.001
Neg logit contributions
lf-0.001
 airplane-0.001
 parked-0.001
 Year-0.001
 cardboard-0.001
Token she
Token ID673
Feature activation-8.52e-01

Pos logit contributions
lf+0.018
 then+0.014
 airplane+0.014
 parked+0.014
 cardboard+0.013
Neg logit contributions
ust-0.028
irk-0.024
oker-0.023
umpy-0.022
 forward-0.022
Token couldn
Token ID3521
Feature activation+9.57e+00

Pos logit contributions
ust+0.033
irk+0.029
oker+0.028
 forward+0.028
 possible+0.028
Neg logit contributions
lf-0.017
 parked-0.012
 airplane-0.012
 then-0.012
 Year-0.012
Token't
Token ID470
Feature activation-2.38e+00

Pos logit contributions
lf+0.168
 parked+0.121
 airplane+0.119
 Year+0.114
 cardboard+0.114
Neg logit contributions
ust-0.399
irk-0.356
oker-0.335
 forward-0.328
 mind-0.327
Token find
Token ID1064
Feature activation+2.70e-01

Pos logit contributions
ust+0.083
irk+0.073
oker+0.070
 forward+0.067
 possible+0.066
Neg logit contributions
lf-0.057
 airplane-0.046
 then-0.046
 parked-0.046
 set-0.045
Token it
Token ID340
Feature activation-7.41e-01

Pos logit contributions
lf+0.005
 cardboard+0.004
 airplane+0.004
 then+0.004
 walls+0.004
Neg logit contributions
ust-0.011
irk-0.010
umpy-0.009
oker-0.009
aring-0.009
Token because
Token ID780
Feature activation-3.72e+00

Pos logit contributions
ust+0.022
irk+0.020
oker+0.019
rys+0.018
ashes+0.018
Neg logit contributions
lf-0.020
 airplane-0.017
 parked-0.017
 then-0.017
 set-0.017
Token her
Token ID607
Feature activation-3.65e+00

Pos logit contributions
ust+0.138
irk+0.121
aring+0.115
ashes+0.114
oker+0.113
Neg logit contributions
lf-0.087
 airplane-0.068
 cardboard-0.067
 then-0.066
 parked-0.066
Token
Token ID198
Feature activation-6.87e-01

Pos logit contributions
lf+0.001
 airplane+0.001
 parked+0.001
 Year+0.001
 cardboard+0.001
Neg logit contributions
ust-0.001
irk-0.001
oker-0.001
 forward-0.001
 mind-0.001
Token
Token ID198
Feature activation-9.38e-01

Pos logit contributions
ust+0.022
irk+0.018
oker+0.017
 forward+0.017
 mind+0.017
Neg logit contributions
lf-0.018
 airplane-0.015
 Year-0.015
 parked-0.015
 cardboard-0.015
TokenM
Token ID44
Feature activation-2.13e+00

Pos logit contributions
ust+0.033
irk+0.028
oker+0.027
 mind+0.026
 forward+0.026
Neg logit contributions
lf-0.022
 airplane-0.017
 parked-0.017
 Year-0.017
 cardboard-0.017
Tokenagg
Token ID9460
Feature activation-3.06e+00

Pos logit contributions
ust+0.062
irk+0.050
oker+0.047
 mind+0.046
 forward+0.046
Neg logit contributions
lf-0.064
 airplane-0.054
 parked-0.054
 cardboard-0.053
 Year-0.053
Tokenie
Token ID494
Feature activation+1.31e+00

Pos logit contributions
ust+0.107
irk+0.085
osa+0.083
 mind+0.083
umble+0.082
Neg logit contributions
lf-0.067
 cardboard-0.060
 airplane-0.058
 fridge-0.057
 Year-0.056
Token couldn
Token ID3521
Feature activation+9.55e+00

Pos logit contributions
lf+0.027
 airplane+0.021
 cardboard+0.021
 Year+0.020
 parked+0.020
Neg logit contributions
ust-0.048
irk-0.041
 mind-0.040
oker-0.039
 forward-0.039
Token't
Token ID470
Feature activation-2.44e+00

Pos logit contributions
lf+0.140
 then+0.097
 parked+0.096
 set+0.092
 airplane+0.092
Neg logit contributions
ust-0.420
irk-0.384
oker-0.356
ashing-0.354
osa-0.348
Token help
Token ID1037
Feature activation-9.86e-01

Pos logit contributions
ust+0.072
irk+0.062
oker+0.059
elly+0.057
apping+0.056
Neg logit contributions
lf-0.071
 then-0.063
 set-0.061
 airplane-0.060
 parked-0.059
Token but
Token ID475
Feature activation+9.31e-01

Pos logit contributions
ust+0.035
irk+0.032
oker+0.032
ashes+0.030
umpy+0.030
Neg logit contributions
lf-0.024
 then-0.020
 walls-0.018
 airplane-0.018
 cardboard-0.018
Token smile
Token ID8212
Feature activation-1.11e+00

Pos logit contributions
 then+0.021
lf+0.020
 set+0.018
 store+0.017
 airplane+0.017
Neg logit contributions
ust-0.033
irk-0.030
umpy-0.029
ashes-0.029
elly-0.029
Token.
Token ID13
Feature activation-1.18e+00

Pos logit contributions
ust+0.035
irk+0.030
oker+0.029
 possible+0.027
olding+0.027
Neg logit contributions
lf-0.030
 airplane-0.025
 then-0.025
 parked-0.025
 set-0.024
Token the
Token ID262
Feature activation+3.37e+00

Pos logit contributions
ust+0.034
irk+0.029
oker+0.027
aring+0.027
olding+0.027
Neg logit contributions
lf-0.022
 airplane-0.018
 cardboard-0.018
 walls-0.017
 then-0.017
Token bed
Token ID3996
Feature activation+5.80e-01

Pos logit contributions
lf+0.059
 airplane+0.043
 parked+0.042
 Year+0.042
 cardboard+0.040
Neg logit contributions
ust-0.137
irk-0.119
oker-0.116
 mind-0.114
 forward-0.114
Token -
Token ID532
Feature activation+2.48e+00

Pos logit contributions
lf+0.016
 airplane+0.013
 walls+0.013
 parked+0.013
 set+0.013
Neg logit contributions
ust-0.019
irk-0.016
oker-0.015
 possible-0.015
 forward-0.014
Token but
Token ID475
Feature activation+1.07e+00

Pos logit contributions
lf+0.058
 parked+0.047
 airplane+0.047
 then+0.046
 cardboard+0.046
Neg logit contributions
ust-0.091
irk-0.079
oker-0.074
aring-0.072
ashes-0.071
Token he
Token ID339
Feature activation-2.06e-01

Pos logit contributions
lf+0.026
 then+0.022
 parked+0.021
 airplane+0.020
 set+0.020
Neg logit contributions
ust-0.038
irk-0.034
oker-0.032
umpy-0.031
ashes-0.031
Token couldn
Token ID3521
Feature activation+9.55e+00

Pos logit contributions
ust+0.009
irk+0.008
oker+0.008
 possible+0.007
 forward+0.007
Neg logit contributions
lf-0.003
 parked-0.002
 then-0.002
 airplane-0.002
 set-0.002
Token't
Token ID470
Feature activation-2.14e+00

Pos logit contributions
lf+0.164
 parked+0.118
 airplane+0.114
 then+0.110
 cardboard+0.109
Neg logit contributions
ust-0.404
irk-0.363
oker-0.339
 forward-0.332
olding-0.332
Token find
Token ID1064
Feature activation+2.98e-01

Pos logit contributions
ust+0.077
irk+0.068
oker+0.066
 forward+0.062
 possible+0.062
Neg logit contributions
lf-0.049
 then-0.040
 set-0.039
 airplane-0.039
 parked-0.039
Token it
Token ID340
Feature activation-4.85e-01

Pos logit contributions
lf+0.006
 cardboard+0.005
 airplane+0.005
 then+0.005
 parked+0.005
Neg logit contributions
ust-0.012
irk-0.010
oker-0.010
umpy-0.010
aring-0.010
Token.
Token ID13
Feature activation-9.10e-01

Pos logit contributions
ust+0.015
irk+0.014
oker+0.014
rys+0.013
ashes+0.013
Neg logit contributions
lf-0.012
 airplane-0.010
 parked-0.010
 then-0.010
 set-0.010
Token
Token ID198
Feature activation-1.67e+00

Pos logit contributions
ust+0.031
irk+0.027
oker+0.026
 possible+0.026
 forward+0.025
Neg logit contributions
lf-0.022
 parked-0.018
 then-0.017
 Year-0.017
 airplane-0.017
Token Benny
Token ID44275
Feature activation+3.30e+00

Pos logit contributions
lf+0.008
 airplane+0.007
 Year+0.006
 parked+0.006
 cardboard+0.006
Neg logit contributions
ust-0.008
irk-0.006
 forward-0.006
 mind-0.006
oker-0.006
Token was
Token ID373
Feature activation+3.10e+00

Pos logit contributions
lf+0.082
 Year+0.069
 airplane+0.066
 cardboard+0.065
 walls+0.062
Neg logit contributions
ust-0.107
 mind-0.090
umpy-0.086
 than-0.085
irk-0.085
Token very
Token ID845
Feature activation-3.91e+00

Pos logit contributions
lf+0.089
 airplane+0.074
 Year+0.074
 parked+0.071
 cardboard+0.071
Neg logit contributions
ust-0.092
irk-0.075
 forward-0.073
 mind-0.073
 possible-0.072
Token restless
Token ID42537
Feature activation+3.34e+00

Pos logit contributions
ust+0.116
irk+0.101
oker+0.093
 fault+0.089
 anymore+0.088
Neg logit contributions
lf-0.107
 then-0.094
 parked-0.092
 set-0.089
 cardboard-0.089
Token and
Token ID290
Feature activation+2.20e+00

Pos logit contributions
lf+0.077
 airplane+0.062
 parked+0.061
 then+0.061
 cardboard+0.059
Neg logit contributions
ust-0.120
irk-0.101
oker-0.099
 possible-0.094
 forward-0.094
Token couldn
Token ID3521
Feature activation+9.53e+00

Pos logit contributions
lf+0.044
 Year+0.035
 airplane+0.034
 cardboard+0.033
 parked+0.032
Neg logit contributions
ust-0.084
irk-0.070
 mind-0.069
oker-0.069
 forward-0.069
Token't
Token ID470
Feature activation-2.28e+00

Pos logit contributions
lf+0.193
 airplane+0.147
 parked+0.145
 Year+0.143
 cardboard+0.141
Neg logit contributions
ust-0.365
irk-0.317
oker-0.304
 forward-0.299
 mind-0.298
Token sit
Token ID1650
Feature activation-1.90e+00

Pos logit contributions
ust+0.074
irk+0.063
oker+0.060
 forward+0.058
 possible+0.058
Neg logit contributions
lf-0.060
 airplane-0.049
 parked-0.049
 then-0.048
 cardboard-0.047
Token still
Token ID991
Feature activation-2.00e+00

Pos logit contributions
ust+0.055
irk+0.046
oker+0.045
 mind+0.042
 possible+0.042
Neg logit contributions
lf-0.056
 parked-0.048
 airplane-0.048
 then-0.046
 Year-0.046
Token.
Token ID13
Feature activation-1.02e+00

Pos logit contributions
ust+0.069
irk+0.062
oker+0.059
osa+0.055
rys+0.055
Neg logit contributions
lf-0.048
 then-0.042
 set-0.040
 parked-0.040
 airplane-0.039
Token He
Token ID679
Feature activation+3.86e+00

Pos logit contributions
ust+0.030
irk+0.024
 mind+0.024
 forward+0.023
oker+0.023
Neg logit contributions
lf-0.030
 airplane-0.025
 Year-0.025
 parked-0.024
 cardboard-0.024
Tokenmy
Token ID1820
Feature activation+2.87e+00

Pos logit contributions
lf+0.014
 then+0.012
 airplane+0.012
 parked+0.012
 cardboard+0.012
Neg logit contributions
ust-0.015
irk-0.012
oker-0.011
 possible-0.011
 mind-0.011
Token was
Token ID373
Feature activation+3.20e+00

Pos logit contributions
lf+0.076
 Year+0.062
 cardboard+0.060
 airplane+0.058
 walls+0.057
Neg logit contributions
ust-0.088
 mind-0.076
irk-0.074
 forward-0.074
oker-0.069
Token sad
Token ID6507
Feature activation+3.45e+00

Pos logit contributions
lf+0.086
 airplane+0.072
 parked+0.072
 then+0.071
 cardboard+0.070
Neg logit contributions
ust-0.102
irk-0.087
oker-0.082
aring-0.077
 forward-0.077
Token because
Token ID780
Feature activation-3.43e+00

Pos logit contributions
 then+0.064
lf+0.062
 airplane+0.054
 set+0.050
 cardboard+0.049
Neg logit contributions
ust-0.133
irk-0.119
oker-0.118
rys-0.113
beh-0.113
Token he
Token ID339
Feature activation-2.10e-01

Pos logit contributions
ust+0.117
irk+0.100
oker+0.094
 forward+0.092
aring+0.091
Neg logit contributions
lf-0.088
 airplane-0.071
 parked-0.069
 cardboard-0.069
 then-0.067
Token couldn
Token ID3521
Feature activation+9.53e+00

Pos logit contributions
ust+0.008
irk+0.007
oker+0.007
 mind+0.007
 forward+0.007
Neg logit contributions
lf-0.004
 airplane-0.003
 Year-0.003
 parked-0.003
 cardboard-0.003
Token't
Token ID470
Feature activation-2.36e+00

Pos logit contributions
lf+0.168
 parked+0.119
 airplane+0.116
 cardboard+0.113
 then+0.111
Neg logit contributions
ust-0.400
irk-0.360
oker-0.336
 forward-0.332
 mind-0.329
Token drink
Token ID4144
Feature activation-2.53e+00

Pos logit contributions
ust+0.082
irk+0.071
oker+0.068
 forward+0.067
 possible+0.066
Neg logit contributions
lf-0.057
 airplane-0.045
 parked-0.045
 then-0.044
 cardboard-0.044
Token his
Token ID465
Feature activation-2.59e+00

Pos logit contributions
ust+0.094
irk+0.087
oker+0.080
aring+0.080
apping+0.079
Neg logit contributions
lf-0.052
 cardboard-0.048
 then-0.048
 walls-0.047
 set-0.046
Token juice
Token ID13135
Feature activation-3.51e-01

Pos logit contributions
ust+0.085
irk+0.072
oker+0.069
 forward+0.067
 possible+0.067
Neg logit contributions
lf-0.066
 airplane-0.054
 cardboard-0.053
 Year-0.053
 parked-0.053
Token.
Token ID13
Feature activation-9.98e-01

Pos logit contributions
ust+0.011
irk+0.010
oker+0.009
rys+0.009
aring+0.009
Neg logit contributions
lf-0.009
 set-0.008
 then-0.008
 airplane-0.008
 store-0.008
Token.
Token ID13
Feature activation+4.48e-01

Pos logit contributions
lf+0.054
 airplane+0.045
 parked+0.044
 then+0.043
 cardboard+0.043
Neg logit contributions
ust-0.069
irk-0.058
oker-0.056
 mind-0.054
 forward-0.054
Token
Token ID220
Feature activation-1.74e-01

Pos logit contributions
lf+0.013
 airplane+0.010
 parked+0.010
 Year+0.010
 cardboard+0.010
Neg logit contributions
ust-0.014
irk-0.011
oker-0.011
 forward-0.010
 mind-0.010
Token
Token ID198
Feature activation-9.56e-01

Pos logit contributions
ust+0.006
irk+0.005
oker+0.005
 mind+0.005
 possible+0.005
Neg logit contributions
lf-0.004
 parked-0.003
 airplane-0.003
 Year-0.003
 cardboard-0.003
TokenKat
Token ID25881
Feature activation-3.92e+00

Pos logit contributions
ust+0.028
irk+0.023
oker+0.022
 forward+0.021
 mind+0.021
Neg logit contributions
lf-0.028
 airplane-0.024
 cardboard-0.023
 Year-0.023
 parked-0.023
Tokeny
Token ID88
Feature activation+2.27e+00

Pos logit contributions
ust+0.133
irk+0.104
osa+0.100
 mind+0.096
umble+0.095
Neg logit contributions
lf-0.091
 cardboard-0.084
 Year-0.078
 airplane-0.077
 engine-0.077
Token couldn
Token ID3521
Feature activation+9.53e+00

Pos logit contributions
lf+0.042
 cardboard+0.033
 airplane+0.033
 Year+0.032
 parked+0.031
Neg logit contributions
ust-0.087
irk-0.074
 mind-0.072
oker-0.071
 forward-0.071
Tokenâ
Token ID22940
Feature activation-1.22e+00

Pos logit contributions
lf+0.133
 then+0.094
 parked+0.094
 set+0.089
 store+0.085
Neg logit contributions
ust-0.424
irk-0.389
oker-0.355
osa-0.355
ashing-0.354
Token€
Token ID26391
Feature activation+8.34e-01

Pos logit contributions
ust+0.033
irk+0.027
oker+0.026
 forward+0.025
 possible+0.024
Neg logit contributions
lf-0.038
 parked-0.032
 airplane-0.032
 Year-0.032
 then-0.031
Token™
Token ID8151
Feature activation-2.54e+00

Pos logit contributions
lf+0.023
 parked+0.019
 airplane+0.019
 Year+0.019
 cardboard+0.018
Neg logit contributions
ust-0.026
irk-0.022
oker-0.021
 forward-0.020
 possible-0.020
Tokent
Token ID83
Feature activation-3.22e+00

Pos logit contributions
ust+0.068
irk+0.060
rys+0.059
 unnecessary+0.059
 mind+0.057
Neg logit contributions
lf-0.072
 parked-0.064
 then-0.062
 set-0.058
ons-0.058
Token wait
Token ID4043
Feature activation-4.48e+00

Pos logit contributions
ust+0.086
irk+0.075
oker+0.069
osa+0.068
elly+0.066
Neg logit contributions
lf-0.099
 then-0.092
 set-0.087
 parked-0.085
 airplane-0.084
Token!
Token ID0
Feature activation+1.35e+00

Pos logit contributions
ust+0.066
irk+0.055
oker+0.053
 forward+0.053
 possible+0.052
Neg logit contributions
lf-0.044
 airplane-0.036
 cardboard-0.034
 Year-0.034
 parked-0.034
Token Even
Token ID3412
Feature activation-1.59e+00

Pos logit contributions
lf+0.037
 airplane+0.031
 parked+0.031
 Year+0.030
 cardboard+0.030
Neg logit contributions
ust-0.041
irk-0.034
oker-0.033
 possible-0.032
 forward-0.032
Token the
Token ID262
Feature activation+3.57e+00

Pos logit contributions
ust+0.058
irk+0.051
oker+0.048
aring+0.047
 loser+0.046
Neg logit contributions
lf-0.034
 then-0.032
 cardboard-0.028
 airplane-0.028
 set-0.027
Token sour
Token ID11348
Feature activation-6.29e-02

Pos logit contributions
lf+0.088
 airplane+0.072
 cardboard+0.070
 parked+0.070
 Year+0.070
Neg logit contributions
ust-0.121
irk-0.102
oker-0.098
 forward-0.095
 possible-0.094
Token fruits
Token ID15921
Feature activation-1.34e+00

Pos logit contributions
ust+0.002
irk+0.002
 forward+0.002
oker+0.002
 possible+0.002
Neg logit contributions
lf-0.001
 airplane-0.001
 then-0.001
 set-0.001
 walls-0.001
Token couldn
Token ID3521
Feature activation+9.53e+00

Pos logit contributions
ust+0.054
irk+0.048
oker+0.045
osa+0.045
 forward+0.044
Neg logit contributions
lf-0.025
 then-0.019
 parked-0.019
 set-0.019
 airplane-0.018
Token't
Token ID470
Feature activation-2.28e+00

Pos logit contributions
lf+0.159
 airplane+0.115
 parked+0.113
 then+0.112
 cardboard+0.112
Neg logit contributions
ust-0.399
irk-0.363
oker-0.338
 forward-0.333
 mind-0.329
Token ruin
Token ID16866
Feature activation-4.29e+00

Pos logit contributions
ust+0.081
irk+0.074
oker+0.068
apping+0.068
 forward+0.067
Neg logit contributions
lf-0.053
 set-0.047
 then-0.046
 store-0.043
 airplane-0.043
Token their
Token ID511
Feature activation-3.12e+00

Pos logit contributions
ust+0.167
irk+0.156
oker+0.145
apping+0.143
umpy+0.142
Neg logit contributions
lf-0.085
 cardboard-0.077
 walls-0.072
 then-0.071
 airplane-0.068
Token fun
Token ID1257
Feature activation-4.27e+00

Pos logit contributions
ust+0.094
irk+0.079
oker+0.075
 forward+0.072
aring+0.070
Neg logit contributions
lf-0.089
 airplane-0.076
 cardboard-0.074
 Year-0.073
 walls-0.073
Token!
Token ID0
Feature activation-8.92e-01

Pos logit contributions
ust+0.145
irk+0.125
oker+0.114
 forward+0.113
olding+0.113
Neg logit contributions
lf-0.112
 airplane-0.092
 cardboard-0.089
 walls-0.088
 set-0.088
Token.
Token ID13
Feature activation+1.32e+00

Pos logit contributions
lf+0.074
 then+0.066
 airplane+0.066
 walls+0.061
 cardboard+0.060
Neg logit contributions
ust-0.156
irk-0.146
oker-0.145
 loser-0.138
ashes-0.135
Token She
Token ID1375
Feature activation+4.57e+00

Pos logit contributions
lf+0.036
 airplane+0.030
 parked+0.030
 Year+0.030
 cardboard+0.029
Neg logit contributions
ust-0.040
irk-0.033
oker-0.032
 forward-0.031
 mind-0.031
Token was
Token ID373
Feature activation+3.01e+00

Pos logit contributions
lf+0.096
 airplane+0.074
 Year+0.072
 cardboard+0.071
 parked+0.071
Neg logit contributions
ust-0.169
irk-0.145
oker-0.140
 mind-0.139
 forward-0.138
Token deaf
Token ID21815
Feature activation+1.83e+00

Pos logit contributions
lf+0.079
 parked+0.068
 then+0.067
 airplane+0.065
 set+0.063
Neg logit contributions
ust-0.099
irk-0.085
oker-0.083
ashes-0.076
aring-0.075
Token and
Token ID290
Feature activation+2.00e+00

Pos logit contributions
lf+0.039
 airplane+0.033
 then+0.033
 walls+0.031
 parked+0.030
Neg logit contributions
ust-0.068
oker-0.058
irk-0.057
 possible-0.054
rys-0.054
Token couldn
Token ID3521
Feature activation+9.52e+00

Pos logit contributions
lf+0.052
 then+0.044
 parked+0.043
 airplane+0.042
 set+0.041
Neg logit contributions
ust-0.066
irk-0.057
oker-0.054
ashes-0.052
 loser-0.051
Token't
Token ID470
Feature activation-2.23e+00

Pos logit contributions
lf+0.143
 parked+0.101
 then+0.097
 airplane+0.097
 Year+0.092
Neg logit contributions
ust-0.419
irk-0.383
oker-0.358
ashing-0.350
 forward-0.347
Token hear
Token ID3285
Feature activation+3.21e+00

Pos logit contributions
ust+0.076
irk+0.068
oker+0.066
ashing+0.062
elly+0.062
Neg logit contributions
lf-0.054
 then-0.050
 set-0.047
 airplane-0.047
 parked-0.045
Token very
Token ID845
Feature activation-4.68e+00

Pos logit contributions
lf+0.080
 walls+0.071
 then+0.069
 airplane+0.069
 cardboard+0.066
Neg logit contributions
ust-0.109
irk-0.096
oker-0.094
ashing-0.087
umpy-0.086
Token well
Token ID880
Feature activation-5.58e+00

Pos logit contributions
ust+0.143
irk+0.128
oker+0.116
beh+0.113
rav+0.110
Neg logit contributions
lf-0.116
 then-0.115
 walls-0.110
 airplane-0.101
 top-0.100
Token.
Token ID13
Feature activation-1.15e+00

Pos logit contributions
ust+0.202
irk+0.184
oker+0.171
 Deal+0.167
rav+0.166
Neg logit contributions
lf-0.124
 then-0.106
 set-0.101
 walls-0.101
 airplane-0.100
Token and
Token ID290
Feature activation+3.90e+00

Pos logit contributions
lf+0.070
 airplane+0.060
 parked+0.058
 then+0.058
 set+0.058
Neg logit contributions
ust-0.078
irk-0.066
oker-0.061
osa-0.060
rys-0.058
Token again
Token ID757
Feature activation-3.58e+00

Pos logit contributions
lf+0.092
 airplane+0.075
 parked+0.074
 then+0.074
 cardboard+0.072
Neg logit contributions
ust-0.139
irk-0.117
oker-0.112
 forward-0.109
aring-0.108
Token,
Token ID11
Feature activation+2.87e+00

Pos logit contributions
ust+0.121
irk+0.107
oker+0.099
olding+0.096
 forward+0.095
Neg logit contributions
lf-0.089
 then-0.078
 set-0.075
 airplane-0.072
 cardboard-0.071
Token but
Token ID475
Feature activation+1.42e+00

Pos logit contributions
lf+0.070
 then+0.062
 airplane+0.059
 parked+0.058
 set+0.057
Neg logit contributions
ust-0.101
irk-0.087
oker-0.082
aring-0.078
umpy-0.078
Token it
Token ID340
Feature activation-1.96e-01

Pos logit contributions
lf+0.031
 then+0.028
 set+0.024
 parked+0.024
 airplane+0.024
Neg logit contributions
ust-0.053
irk-0.049
umpy-0.045
oker-0.045
ashes-0.045
Token couldn
Token ID3521
Feature activation+9.51e+00

Pos logit contributions
ust+0.009
irk+0.008
oker+0.008
 forward+0.008
olding+0.007
Neg logit contributions
lf-0.003
 then-0.002
 parked-0.002
 set-0.002
 airplane-0.002
Token't
Token ID470
Feature activation-2.12e+00

Pos logit contributions
lf+0.180
 airplane+0.136
 parked+0.133
 Year+0.130
 cardboard+0.130
Neg logit contributions
ust-0.378
irk-0.335
oker-0.317
 forward-0.311
 mind-0.309
Token make
Token ID787
Feature activation+1.26e+00

Pos logit contributions
ust+0.073
irk+0.066
oker+0.062
 forward+0.059
osa+0.059
Neg logit contributions
lf-0.052
 then-0.043
 set-0.043
 airplane-0.042
 parked-0.042
Token the
Token ID262
Feature activation+2.36e+00

Pos logit contributions
lf+0.024
 airplane+0.024
 cardboard+0.023
 walls+0.022
 engine+0.021
Neg logit contributions
ust-0.047
irk-0.044
oker-0.042
olding-0.041
ashing-0.041
Token nest
Token ID16343
Feature activation+1.83e+00

Pos logit contributions
lf+0.059
 airplane+0.049
 cardboard+0.047
 parked+0.047
 Year+0.047
Neg logit contributions
ust-0.080
irk-0.067
oker-0.064
 forward-0.063
 mind-0.062
Token stay
Token ID2652
Feature activation-3.40e+00

Pos logit contributions
lf+0.050
 airplane+0.045
 set+0.043
 parked+0.042
 walls+0.042
Neg logit contributions
ust-0.055
irk-0.050
oker-0.045
osa-0.045
 loser-0.043
Token that
Token ID326
Feature activation+1.52e+00

Pos logit contributions
lf+0.036
 Year+0.029
 cardboard+0.027
 walls+0.025
 blasted+0.024
Neg logit contributions
ust-0.077
 possible-0.071
irk-0.069
 forward-0.069
 than-0.067
Token day
Token ID1110
Feature activation+1.53e+00

Pos logit contributions
lf+0.038
 airplane+0.031
 cardboard+0.030
 parked+0.030
 then+0.030
Neg logit contributions
ust-0.052
irk-0.043
oker-0.041
ashes-0.040
 mind-0.040
Token on
Token ID319
Feature activation-3.66e+00

Pos logit contributions
lf+0.036
 then+0.031
 parked+0.030
 airplane+0.029
pped+0.029
Neg logit contributions
ust-0.054
irk-0.046
ashes-0.044
oker-0.042
umble-0.042
Token,
Token ID11
Feature activation+1.04e+00

Pos logit contributions
ust+0.164
irk+0.141
ashes+0.135
ashing+0.131
olding+0.131
Neg logit contributions
lf-0.066
 then-0.048
 cardboard-0.046
 airplane-0.041
 store-0.040
Token Cr
Token ID3864
Feature activation-5.12e+00

Pos logit contributions
lf+0.021
 then+0.019
 set+0.018
 store+0.018
 cardboard+0.017
Neg logit contributions
ust-0.038
irk-0.035
olding-0.032
istered-0.032
umpy-0.031
Tokenust
Token ID436
Feature activation-1.42e+01

Pos logit contributions
ust+0.018
 forward+0.014
 mind+0.014
irk+0.011
 possible+0.010
Neg logit contributions
lf-0.238
 airplane-0.218
 parked-0.216
 set-0.214
 cardboard-0.213
Tokeny
Token ID88
Feature activation+1.02e+00

Pos logit contributions
ust+0.333
irk+0.293
oker+0.273
 forward+0.265
 mind+0.251
Neg logit contributions
lf-0.449
 then-0.391
 parked-0.390
 set-0.389
 airplane-0.388
Token tried
Token ID3088
Feature activation+4.89e+00

Pos logit contributions
lf+0.020
 airplane+0.015
 parked+0.015
 Year+0.015
 cardboard+0.015
Neg logit contributions
ust-0.039
irk-0.034
oker-0.033
 mind-0.033
 forward-0.033
Token his
Token ID465
Feature activation-3.14e+00

Pos logit contributions
 then+0.092
lf+0.088
 cardboard+0.083
 walls+0.081
 set+0.080
Neg logit contributions
ust-0.184
oker-0.168
irk-0.160
umpy-0.155
istered-0.146
Token best
Token ID1266
Feature activation-7.14e+00

Pos logit contributions
ust+0.080
irk+0.065
oker+0.062
umble+0.056
apping+0.056
Neg logit contributions
lf-0.103
 airplane-0.091
 cardboard-0.089
 then-0.088
 walls-0.088
Token to
Token ID284
Feature activation+1.04e+00

Pos logit contributions
ust+0.233
oker+0.197
irk+0.196
istered+0.180
umpy+0.179
Neg logit contributions
lf-0.170
 then-0.154
 set-0.152
 airplane-0.152
 engine-0.146
Token.
Token ID13
Feature activation-1.59e+00

Pos logit contributions
ust+0.104
irk+0.086
oker+0.082
 mind+0.078
 forward+0.077
Neg logit contributions
lf-0.099
 airplane-0.085
 parked-0.084
 cardboard-0.082
 then-0.082
Token One
Token ID1881
Feature activation-3.09e-02

Pos logit contributions
ust+0.041
irk+0.033
 mind+0.032
 forward+0.031
oker+0.031
Neg logit contributions
lf-0.054
 Year-0.044
 airplane-0.044
 parked-0.042
cut-0.041
Token day
Token ID1110
Feature activation+2.01e+00

Pos logit contributions
ust+0.001
irk+0.001
oker+0.001
 forward+0.001
 possible+0.001
Neg logit contributions
lf-0.001
 parked-0.001
 airplane-0.001
 set-0.001
 then-0.001
Token,
Token ID11
Feature activation+1.16e+00

Pos logit contributions
lf+0.053
 airplane+0.044
 Year+0.043
 parked+0.042
 cardboard+0.041
Neg logit contributions
ust-0.064
irk-0.053
 forward-0.052
oker-0.052
 mind-0.052
Token Cr
Token ID3864
Feature activation-4.88e+00

Pos logit contributions
lf+0.042
 Year+0.037
 airplane+0.034
 couch+0.033
cut+0.033
Neg logit contributions
ust-0.024
 forward-0.022
 mind-0.021
 possible-0.020
 loser-0.019
Tokenust
Token ID436
Feature activation-1.39e+01

Pos logit contributions
ust+0.025
irk+0.010
 forward+0.007
 mind+0.006
oker+0.006
Neg logit contributions
lf-0.241
 airplane-0.220
 parked-0.218
 cardboard-0.216
 set-0.215
Tokeny
Token ID88
Feature activation+1.51e+00

Pos logit contributions
ust+0.298
irk+0.219
oker+0.206
 forward+0.199
 mind+0.199
Neg logit contributions
lf-0.519
 airplane-0.451
 Year-0.447
 parked-0.445
 cardboard-0.443
Token found
Token ID1043
Feature activation+5.06e+00

Pos logit contributions
lf+0.035
 Year+0.029
 cardboard+0.029
cut+0.029
 airplane+0.028
Neg logit contributions
ust-0.048
umpy-0.042
irk-0.041
 mind-0.040
oker-0.040
Token a
Token ID257
Feature activation+3.22e+00

Pos logit contributions
lf+0.166
 Year+0.140
 airplane+0.125
 parked+0.122
 couch+0.120
Neg logit contributions
ust-0.136
 possible-0.113
irk-0.111
 forward-0.110
 mind-0.110
Token shiny
Token ID22441
Feature activation+1.60e+00

Pos logit contributions
lf+0.085
 airplane+0.072
 cardboard+0.070
 parked+0.068
 Year+0.067
Neg logit contributions
ust-0.104
irk-0.087
oker-0.083
 forward-0.081
aring-0.079
Token shell
Token ID7582
Feature activation+1.56e+00

Pos logit contributions
lf+0.040
 Year+0.034
 parked+0.033
ons+0.031
 then+0.030
Neg logit contributions
ust-0.049
 mind-0.043
irk-0.042
 possible-0.041
oker-0.041
Token Cr
Token ID3864
Feature activation-3.99e+00

Pos logit contributions
lf+0.069
 Year+0.058
 airplane+0.058
 parked+0.055
cut+0.054
Neg logit contributions
ust-0.071
 mind-0.057
 possible-0.057
irk-0.057
oker-0.056
Tokenust
Token ID436
Feature activation-1.34e+01

Pos logit contributions
ust+0.039
irk+0.027
 mind+0.025
 forward+0.025
 possible+0.024
Neg logit contributions
lf-0.178
 airplane-0.160
 parked-0.159
 cardboard-0.156
 set-0.156
Tokeny
Token ID88
Feature activation+1.79e+00

Pos logit contributions
ust+0.283
irk+0.210
oker+0.197
 forward+0.190
 mind+0.189
Neg logit contributions
lf-0.503
 airplane-0.438
 Year-0.433
 parked-0.433
 cardboard-0.430
Token.
Token ID13
Feature activation-1.61e+00

Pos logit contributions
lf+0.062
 Year+0.055
 airplane+0.054
 cardboard+0.054
 parked+0.053
Neg logit contributions
ust-0.040
irk-0.031
 mind-0.029
oker-0.028
aring-0.028
Token Cr
Token ID3864
Feature activation-4.59e+00

Pos logit contributions
ust+0.035
irk+0.028
 forward+0.027
 mind+0.026
oker+0.025
Neg logit contributions
lf-0.061
 Year-0.053
 airplane-0.050
 parked-0.049
cut-0.047
Tokenust
Token ID436
Feature activation-1.37e+01

Pos logit contributions
ust+0.022
irk+0.005
 forward+0.003
 mind+0.002
oker+0.002
Neg logit contributions
lf-0.233
 airplane-0.212
 parked-0.210
 cardboard-0.208
 Year-0.207
Tokeny
Token ID88
Feature activation+1.59e+00

Pos logit contributions
ust+0.305
irk+0.231
oker+0.216
 forward+0.209
 mind+0.209
Neg logit contributions
lf-0.499
 airplane-0.433
 Year-0.428
 parked-0.428
 cardboard-0.424
Token lived
Token ID5615
Feature activation+3.17e+00

Pos logit contributions
lf+0.032
 Year+0.029
 airplane+0.028
 cardboard+0.026
 checkout+0.026
Neg logit contributions
ust-0.055
 mind-0.048
 than-0.047
umpy-0.046
irk-0.046
Token in
Token ID287
Feature activation-3.22e+00

Pos logit contributions
lf+0.090
 airplane+0.075
 parked+0.075
 cardboard+0.073
 Year+0.073
Neg logit contributions
ust-0.095
irk-0.078
oker-0.075
 mind-0.072
 possible-0.072
Token a
Token ID257
Feature activation+3.21e+00

Pos logit contributions
ust+0.090
 mind+0.078
irk+0.076
 possible+0.076
 common+0.074
Neg logit contributions
lf-0.092
 Year-0.074
 airplane-0.074
cut-0.072
 parked-0.071
Token big
Token ID1263
Feature activation+2.49e+00

Pos logit contributions
lf+0.102
 Year+0.088
 parked+0.087
 airplane+0.086
cut+0.084
Neg logit contributions
ust-0.081
irk-0.065
 mind-0.064
oker-0.063
 forward-0.062
Token answer
Token ID3280
Feature activation-2.10e+00

Pos logit contributions
ust+0.236
oker+0.179
 Deal+0.178
umpy+0.176
beh+0.175
Neg logit contributions
 set-0.138
lf-0.132
 airplane-0.130
 cardboard-0.128
 colored-0.125
Token for
Token ID329
Feature activation-4.83e+00

Pos logit contributions
ust+0.065
irk+0.055
oker+0.053
ashes+0.050
rys+0.049
Neg logit contributions
lf-0.058
 airplane-0.049
 parked-0.049
 then-0.049
 set-0.047
Token him
Token ID683
Feature activation-2.83e+00

Pos logit contributions
ust+0.169
irk+0.143
oker+0.138
 forward+0.135
 mind+0.135
Neg logit contributions
lf-0.114
 airplane-0.091
 parked-0.089
 Year-0.089
 cardboard-0.088
Token.
Token ID13
Feature activation-1.64e+00

Pos logit contributions
ust+0.089
irk+0.076
oker+0.074
rys+0.068
 loser+0.068
Neg logit contributions
lf-0.074
 then-0.064
 airplane-0.064
 parked-0.063
 set-0.061
Token Fr
Token ID1305
Feature activation-3.72e+00

Pos logit contributions
ust+0.061
 possible+0.051
irk+0.050
oker+0.049
rys+0.049
Neg logit contributions
lf-0.034
 parked-0.029
 then-0.029
 store-0.027
 Year-0.027
Tokenust
Token ID436
Feature activation-1.36e+01

Pos logit contributions
ust+0.051
irk+0.032
oker+0.028
 forward+0.026
 mind+0.026
Neg logit contributions
lf-0.166
 airplane-0.148
 parked-0.147
 Year-0.146
 cardboard-0.145
Tokenrated
Token ID4111
Feature activation+2.10e+00

Pos logit contributions
ust+0.351
irk+0.299
oker+0.282
 forward+0.269
 mind+0.264
Neg logit contributions
lf-0.427
 airplane-0.368
 parked-0.364
 then-0.360
 set-0.355
Token,
Token ID11
Feature activation+9.44e-01

Pos logit contributions
lf+0.042
 airplane+0.034
 then+0.033
 cardboard+0.032
 set+0.032
Neg logit contributions
ust-0.082
irk-0.070
oker-0.068
 possible-0.065
 mind-0.065
Token he
Token ID339
Feature activation-1.21e+00

Pos logit contributions
lf+0.021
 then+0.018
 parked+0.017
 airplane+0.016
 set+0.016
Neg logit contributions
ust-0.037
irk-0.032
aring-0.030
umpy-0.030
oker-0.030
Token decided
Token ID3066
Feature activation+6.96e+00

Pos logit contributions
ust+0.046
irk+0.040
oker+0.039
 forward+0.038
 mind+0.038
Neg logit contributions
lf-0.024
 airplane-0.018
 Year-0.018
 parked-0.018
 cardboard-0.018
Token to
Token ID284
Feature activation+1.21e+00

Pos logit contributions
lf+0.187
 then+0.162
 airplane+0.161
 parked+0.160
 set+0.154
Neg logit contributions
ust-0.224
irk-0.189
oker-0.184
ashes-0.172
umpy-0.170
Token back
Token ID736
Feature activation-5.76e+00

Pos logit contributions
irk+0.040
ust+0.040
oker+0.037
apping+0.036
rys+0.035
Neg logit contributions
 set-0.025
 airplane-0.023
 then-0.023
 store-0.022
 parked-0.022
Token.
Token ID13
Feature activation-1.61e+00

Pos logit contributions
ust+0.197
irk+0.181
oker+0.173
rys+0.164
 loser+0.159
Neg logit contributions
lf-0.126
 then-0.122
 set-0.110
 parked-0.108
 store-0.106
Token
Token ID198
Feature activation-2.28e+00

Pos logit contributions
ust+0.064
oker+0.054
irk+0.053
 possible+0.053
 unnecessary+0.052
Neg logit contributions
lf-0.027
 then-0.024
 Year-0.022
 store-0.022
 walls-0.021
Token
Token ID198
Feature activation-2.32e+00

Pos logit contributions
ust+0.076
irk+0.063
oker+0.061
 forward+0.060
 mind+0.060
Neg logit contributions
lf-0.058
 parked-0.046
 airplane-0.046
 Year-0.046
 cardboard-0.045
TokenFr
Token ID6732
Feature activation-5.34e+00

Pos logit contributions
ust+0.087
irk+0.078
oker+0.074
 mind+0.074
 forward+0.073
Neg logit contributions
lf-0.047
 parked-0.035
 then-0.034
 Year-0.034
 airplane-0.033
Tokenust
Token ID436
Feature activation-1.36e+01

Pos logit contributions
ust+0.008
irk+0.000
 forward-0.007
 mind-0.009
oker-0.009
Neg logit contributions
lf-0.286
 airplane-0.256
 parked-0.254
 then-0.252
 Year-0.251
Tokenrated
Token ID4111
Feature activation+2.11e+00

Pos logit contributions
ust+0.367
irk+0.326
oker+0.301
 forward+0.287
 mind+0.283
Neg logit contributions
lf-0.396
 then-0.346
 airplane-0.338
 parked-0.337
 set-0.336
Token,
Token ID11
Feature activation+1.14e+00

Pos logit contributions
lf+0.037
 then+0.031
 set+0.029
 airplane+0.028
 walls+0.028
Neg logit contributions
ust-0.087
irk-0.076
oker-0.073
 possible-0.070
ashes-0.070
Token Jane
Token ID12091
Feature activation-2.61e+00

Pos logit contributions
lf+0.028
 then+0.027
 walls+0.023
 set+0.023
 store+0.022
Neg logit contributions
ust-0.041
irk-0.036
aring-0.035
ashing-0.034
ashes-0.034
Token went
Token ID1816
Feature activation+1.71e-01

Pos logit contributions
ust+0.097
irk+0.082
 forward+0.080
 mind+0.079
oker+0.079
Neg logit contributions
lf-0.055
 Year-0.042
 airplane-0.042
 cardboard-0.042
 parked-0.041
Token back
Token ID736
Feature activation-5.73e+00

Pos logit contributions
lf+0.004
 store+0.004
 then+0.004
 parked+0.004
 set+0.004
Neg logit contributions
ust-0.006
oker-0.005
irk-0.005
umble-0.005
 loser-0.004
Token stick
Token ID4859
Feature activation+1.75e+00

Pos logit contributions
ust+0.109
irk+0.099
osa+0.092
oker+0.092
 forward+0.088
Neg logit contributions
lf-0.080
 set-0.068
 parked-0.066
 cardboard-0.065
 airplane-0.065
Token.
Token ID13
Feature activation-1.51e+00

Pos logit contributions
lf+0.038
 set+0.036
 cardboard+0.035
 then+0.034
 parked+0.034
Neg logit contributions
ust-0.060
irk-0.057
osa-0.054
oker-0.051
 Deal-0.049
Token\n
Token ID198
Feature activation-2.21e+00

Pos logit contributions
ust+0.042
irk+0.034
 mind+0.033
 forward+0.032
oker+0.032
Neg logit contributions
lf-0.047
 airplane-0.041
 parked-0.039
 Year-0.038
 cardboard-0.038
Token\n
Token ID198
Feature activation-2.26e+00

Pos logit contributions
ust+0.059
irk+0.047
 mind+0.043
 forward+0.043
oker+0.043
Neg logit contributions
lf-0.069
 airplane-0.065
 Year-0.060
 cardboard-0.060
 parked-0.059
TokenFr
Token ID6732
Feature activation-5.39e+00

Pos logit contributions
ust+0.082
umpy+0.064
aring+0.063
 forward+0.061
olding+0.061
Neg logit contributions
lf-0.050
 airplane-0.049
 Year-0.046
 fridge-0.045
 cardboard-0.045
Tokenust
Token ID436
Feature activation-1.35e+01

Pos logit contributions
ust+0.029
irk+0.005
oker-0.001
 forward-0.003
 mind-0.004
Neg logit contributions
lf-0.282
 airplane-0.256
 parked-0.254
 Year-0.252
 cardboard-0.252
Tokenrated
Token ID4111
Feature activation+2.26e+00

Pos logit contributions
ust+0.331
irk+0.261
oker+0.246
 forward+0.239
 mind+0.237
Neg logit contributions
lf-0.459
 airplane-0.394
 parked-0.390
 Year-0.388
 cardboard-0.385
Token,
Token ID11
Feature activation+1.03e+00

Pos logit contributions
lf+0.048
 airplane+0.038
 cardboard+0.037
 then+0.037
 parked+0.037
Neg logit contributions
ust-0.085
irk-0.073
oker-0.071
 possible-0.069
 mind-0.068
Token the
Token ID262
Feature activation+1.87e+00

Pos logit contributions
lf+0.024
 airplane+0.019
 cardboard+0.019
 parked+0.019
 Year+0.018
Neg logit contributions
ust-0.037
irk-0.031
oker-0.030
 forward-0.029
 possible-0.029
Token lion
Token ID18744
Feature activation+3.38e+00

Pos logit contributions
lf+0.045
 Year+0.043
bed+0.041
esh+0.041
pped+0.040
Neg logit contributions
 mind-0.052
 loser-0.050
ust-0.049
 good-0.047
 possible-0.046
Token went
Token ID1816
Feature activation+1.59e-01

Pos logit contributions
lf+0.074
 Year+0.062
 cardboard+0.058
 airplane+0.057
 fridge+0.056
Neg logit contributions
ust-0.120
irk-0.099
 mind-0.098
 forward-0.097
oker-0.094
Token or
Token ID393
Feature activation+7.67e-01

Pos logit contributions
lf+0.016
 airplane+0.013
 Year+0.013
 parked+0.012
 cardboard+0.012
Neg logit contributions
ust-0.020
irk-0.017
 forward-0.016
oker-0.016
 possible-0.016
Token too
Token ID1165
Feature activation-9.06e+00

Pos logit contributions
lf+0.021
 then+0.019
 parked+0.018
 set+0.018
 airplane+0.018
Neg logit contributions
ust-0.023
irk-0.021
oker-0.020
umpy-0.018
apping-0.018
Token hard
Token ID1327
Feature activation-5.75e-01

Pos logit contributions
ust+0.284
irk+0.258
oker+0.241
 Deal+0.231
 loser+0.228
Neg logit contributions
lf-0.235
 parked-0.200
 set-0.196
 cardboard-0.191
 then-0.190
Token.
Token ID13
Feature activation-1.78e+00

Pos logit contributions
ust+0.017
irk+0.014
oker+0.014
 loser+0.013
 mind+0.013
Neg logit contributions
lf-0.016
 airplane-0.014
 cardboard-0.014
 then-0.014
 set-0.014
Token Fr
Token ID1305
Feature activation-3.61e+00

Pos logit contributions
ust+0.054
irk+0.044
 mind+0.043
 forward+0.042
oker+0.042
Neg logit contributions
lf-0.051
 airplane-0.043
 parked-0.042
 Year-0.041
 cardboard-0.040
Tokenust
Token ID436
Feature activation-1.35e+01

Pos logit contributions
ust+0.059
irk+0.045
oker+0.041
 mind+0.040
 forward+0.040
Neg logit contributions
lf-0.148
 airplane-0.130
 parked-0.129
 Year-0.127
 cardboard-0.125
Tokenrated
Token ID4111
Feature activation+2.20e+00

Pos logit contributions
ust+0.335
irk+0.270
oker+0.255
 forward+0.246
 mind+0.242
Neg logit contributions
lf-0.447
 airplane-0.385
 parked-0.381
 cardboard-0.375
 Year-0.374
Token,
Token ID11
Feature activation+7.90e-01

Pos logit contributions
lf+0.047
 airplane+0.038
 then+0.036
 cardboard+0.036
 Year+0.036
Neg logit contributions
ust-0.082
irk-0.071
oker-0.069
 possible-0.066
 mind-0.066
Token she
Token ID673
Feature activation-1.81e+00

Pos logit contributions
lf+0.020
 airplane+0.017
 then+0.016
 cardboard+0.016
 walls+0.016
Neg logit contributions
ust-0.027
irk-0.023
oker-0.022
aring-0.021
olding-0.021
Token kept
Token ID4030
Feature activation+7.92e-01

Pos logit contributions
ust+0.058
 mind+0.046
 forward+0.044
irm+0.043
ks+0.041
Neg logit contributions
lf-0.046
 Year-0.043
 cardboard-0.038
 fridge-0.037
 couch-0.036
Token searching
Token ID10342
Feature activation-1.97e+00

Pos logit contributions
lf+0.019
 airplane+0.016
 parked+0.016
 then+0.015
 cardboard+0.015
Neg logit contributions
ust-0.028
irk-0.023
oker-0.023
umpy-0.021
 mind-0.021
Token was
Token ID373
Feature activation+1.82e+00

Pos logit contributions
ust+0.049
irk+0.043
oker+0.042
 forward+0.041
 mind+0.041
Neg logit contributions
lf-0.015
 airplane-0.010
 parked-0.009
 Year-0.009
 cardboard-0.009
Token too
Token ID1165
Feature activation-8.90e+00

Pos logit contributions
lf+0.044
 set+0.043
 then+0.043
 coal+0.041
 parked+0.041
Neg logit contributions
irk-0.059
ust-0.056
oker-0.056
ashes-0.054
apping-0.053
Token dangerous
Token ID4923
Feature activation+1.74e-01

Pos logit contributions
ust+0.280
irk+0.251
oker+0.240
 loser+0.236
 forward+0.235
Neg logit contributions
lf-0.214
 parked-0.194
 coal-0.193
 cardboard-0.191
 set-0.188
Token.
Token ID13
Feature activation-1.46e+00

Pos logit contributions
lf+0.004
 then+0.004
 airplane+0.004
 parked+0.004
 set+0.004
Neg logit contributions
ust-0.006
irk-0.005
oker-0.005
rys-0.005
 loser-0.005
Token Fr
Token ID1305
Feature activation-3.55e+00

Pos logit contributions
ust+0.046
irk+0.037
 possible+0.037
oker+0.036
 forward+0.035
Neg logit contributions
lf-0.040
 airplane-0.033
 cardboard-0.032
 parked-0.032
 fridge-0.032
Tokenust
Token ID436
Feature activation-1.34e+01

Pos logit contributions
ust+0.039
irk+0.021
oker+0.017
 forward+0.015
 mind+0.015
Neg logit contributions
lf-0.168
 airplane-0.151
 parked-0.150
 Year-0.149
 cardboard-0.149
Tokenrated
Token ID4111
Feature activation+2.23e+00

Pos logit contributions
ust+0.343
irk+0.285
oker+0.269
 forward+0.260
 mind+0.255
Neg logit contributions
lf-0.429
 airplane-0.370
 parked-0.366
 then-0.358
 cardboard-0.355
Token,
Token ID11
Feature activation+1.14e+00

Pos logit contributions
lf+0.040
 then+0.033
 airplane+0.032
 walls+0.030
 set+0.030
Neg logit contributions
ust-0.092
irk-0.079
oker-0.078
 possible-0.075
olding-0.074
Token Sam
Token ID3409
Feature activation+2.05e+00

Pos logit contributions
lf+0.030
 then+0.025
 airplane+0.024
 cardboard+0.023
 set+0.023
Neg logit contributions
ust-0.039
irk-0.032
aring-0.031
ashes-0.031
oker-0.031
Token turned
Token ID2900
Feature activation+6.12e-01

Pos logit contributions
lf+0.044
 Year+0.034
 airplane+0.034
 cardboard+0.034
 parked+0.033
Neg logit contributions
ust-0.076
irk-0.065
 mind-0.062
oker-0.062
 forward-0.062
Token away
Token ID1497
Feature activation-2.17e+00

Pos logit contributions
lf+0.014
 then+0.013
 airplane+0.013
 parked+0.013
 set+0.012
Neg logit contributions
ust-0.022
oker-0.018
 loser-0.017
irk-0.017
osa-0.017
Token a
Token ID257
Feature activation+3.23e+00

Pos logit contributions
lf+0.069
cut+0.061
 Year+0.056
rus+0.056
 airplane+0.056
Neg logit contributions
ust-0.046
 OK-0.037
 mind-0.036
 forward-0.036
 much-0.036
Token little
Token ID1310
Feature activation+3.90e+00

Pos logit contributions
lf+0.088
 Year+0.073
 parked+0.073
 airplane+0.072
ons+0.071
Neg logit contributions
ust-0.096
irk-0.081
 mind-0.079
oker-0.077
 forward-0.076
Token crab
Token ID32202
Feature activation+1.47e+00

Pos logit contributions
lf+0.098
 airplane+0.081
 cardboard+0.078
 Year+0.077
 parked+0.077
Neg logit contributions
ust-0.132
irk-0.109
oker-0.106
 possible-0.104
 forward-0.104
Token named
Token ID3706
Feature activation+2.43e+00

Pos logit contributions
lf+0.030
 Year+0.024
 airplane+0.023
 parked+0.023
 walls+0.023
Neg logit contributions
ust-0.055
irk-0.047
 mind-0.045
oker-0.045
aring-0.045
Token Cr
Token ID3864
Feature activation-3.99e+00

Pos logit contributions
lf+0.069
 Year+0.058
 airplane+0.058
 parked+0.055
cut+0.054
Neg logit contributions
ust-0.071
 mind-0.057
 possible-0.057
irk-0.057
oker-0.056
Tokenust
Token ID436
Feature activation-1.34e+01

Pos logit contributions
ust+0.039
irk+0.027
 mind+0.025
 forward+0.025
 possible+0.024
Neg logit contributions
lf-0.178
 airplane-0.160
 parked-0.159
 cardboard-0.156
 set-0.156
Tokeny
Token ID88
Feature activation+1.79e+00

Pos logit contributions
ust+0.283
irk+0.210
oker+0.197
 forward+0.190
 mind+0.189
Neg logit contributions
lf-0.503
 airplane-0.438
 Year-0.433
 parked-0.433
 cardboard-0.430
Token.
Token ID13
Feature activation-1.61e+00

Pos logit contributions
lf+0.062
 Year+0.055
 airplane+0.054
 cardboard+0.054
 parked+0.053
Neg logit contributions
ust-0.040
irk-0.031
 mind-0.029
oker-0.028
aring-0.028
Token Cr
Token ID3864
Feature activation-4.59e+00

Pos logit contributions
ust+0.035
irk+0.028
 forward+0.027
 mind+0.026
oker+0.025
Neg logit contributions
lf-0.061
 Year-0.053
 airplane-0.050
 parked-0.049
cut-0.047
Tokenust
Token ID436
Feature activation-1.37e+01

Pos logit contributions
ust+0.022
irk+0.005
 forward+0.003
 mind+0.002
oker+0.002
Neg logit contributions
lf-0.233
 airplane-0.212
 parked-0.210
 cardboard-0.208
 Year-0.207
Tokeny
Token ID88
Feature activation+1.59e+00

Pos logit contributions
ust+0.305
irk+0.231
oker+0.216
 forward+0.209
 mind+0.209
Neg logit contributions
lf-0.499
 airplane-0.433
 Year-0.428
 parked-0.428
 cardboard-0.424
Token but
Token ID475
Feature activation+8.57e-01

Pos logit contributions
lf+0.042
 Year+0.035
 airplane+0.035
 parked+0.034
 cardboard+0.034
Neg logit contributions
ust-0.029
irk-0.023
 mind-0.022
 forward-0.022
 possible-0.021
Token couldn
Token ID3521
Feature activation+8.92e+00

Pos logit contributions
lf+0.021
 Year+0.018
 airplane+0.017
 cardboard+0.016
 parked+0.016
Neg logit contributions
ust-0.028
 forward-0.024
 mind-0.024
 possible-0.023
irk-0.023
Token't
Token ID470
Feature activation-2.93e+00

Pos logit contributions
lf+0.236
 airplane+0.207
 Year+0.206
 cardboard+0.197
cut+0.192
Neg logit contributions
ust-0.263
 possible-0.217
 forward-0.213
oker-0.209
 loser-0.202
Token.
Token ID13
Feature activation-1.50e+00

Pos logit contributions
ust+0.090
irk+0.074
oker+0.071
 forward+0.070
 mind+0.069
Neg logit contributions
lf-0.082
 airplane-0.068
 parked-0.067
 Year-0.066
 cardboard-0.066
Token Fr
Token ID1305
Feature activation-3.31e+00

Pos logit contributions
ust+0.037
 mind+0.030
irk+0.028
 forward+0.027
oker+0.026
Neg logit contributions
lf-0.051
 airplane-0.043
 Year-0.043
 parked-0.041
ons-0.041
Tokenust
Token ID436
Feature activation-1.33e+01

Pos logit contributions
ust+0.031
irk+0.008
oker+0.005
 possible+0.002
umpy+0.002
Neg logit contributions
lf-0.166
 Year-0.152
 airplane-0.152
 parked-0.151
 cardboard-0.151
Tokenrated
Token ID4111
Feature activation+2.33e+00

Pos logit contributions
ust+0.320
irk+0.236
 possible+0.225
 mind+0.223
oker+0.223
Neg logit contributions
lf-0.470
 Year-0.409
 airplane-0.403
 cardboard-0.399
 parked-0.396
Token,
Token ID11
Feature activation+1.28e+00

Pos logit contributions
lf+0.049
 airplane+0.039
 cardboard+0.038
 parked+0.038
 then+0.038
Neg logit contributions
ust-0.088
irk-0.076
oker-0.074
 possible-0.071
 mind-0.071
Token he
Token ID339
Feature activation-9.33e-01

Pos logit contributions
lf+0.038
 Year+0.034
 airplane+0.031
 parked+0.031
cut+0.030
Neg logit contributions
ust-0.034
 forward-0.028
 mind-0.028
 possible-0.028
irk-0.028
Token started
Token ID2067
Feature activation+2.17e+00

Pos logit contributions
ust+0.030
 mind+0.023
 forward+0.021
irm+0.021
irk+0.021
Neg logit contributions
lf-0.023
 Year-0.023
 cardboard-0.020
 fridge-0.020
cut-0.020
Token to
Token ID284
Feature activation+1.49e+00

Pos logit contributions
lf+0.067
 airplane+0.056
 parked+0.056
 cardboard+0.055
 then+0.054
Neg logit contributions
ust-0.062
irk-0.050
oker-0.048
 forward-0.045
 possible-0.045
Token around
Token ID1088
Feature activation-2.68e-01

Pos logit contributions
ust+0.046
irk+0.041
oker+0.040
olding+0.038
osa+0.038
Neg logit contributions
lf-0.032
 airplane-0.029
 cardboard-0.028
 then-0.028
 parked-0.027
Token the
Token ID262
Feature activation+1.97e+00

Pos logit contributions
ust+0.009
irk+0.007
oker+0.007
aring+0.007
olding+0.007
Neg logit contributions
lf-0.007
 airplane-0.006
 cardboard-0.006
 parked-0.006
 walls-0.006
Token door
Token ID3420
Feature activation+6.57e-01

Pos logit contributions
lf+0.040
 Year+0.031
 parked+0.030
 airplane+0.030
 then+0.029
Neg logit contributions
ust-0.071
 mind-0.065
irk-0.062
 forward-0.062
oker-0.060
Token.
Token ID13
Feature activation-1.39e+00

Pos logit contributions
lf+0.019
 airplane+0.016
 parked+0.016
 then+0.016
 cardboard+0.016
Neg logit contributions
ust-0.019
irk-0.016
oker-0.015
 mind-0.014
osa-0.014
Token Fr
Token ID1305
Feature activation-3.49e+00

Pos logit contributions
ust+0.042
irk+0.035
oker+0.034
 mind+0.033
 forward+0.033
Neg logit contributions
lf-0.039
 airplane-0.033
 parked-0.032
 Year-0.032
 cardboard-0.031
Tokenust
Token ID436
Feature activation-1.33e+01

Pos logit contributions
ust+0.015
irk-0.004
oker-0.008
 forward-0.010
 mind-0.010
Neg logit contributions
lf-0.190
 airplane-0.174
 parked-0.172
 Year-0.172
 cardboard-0.171
Tokenrated
Token ID4111
Feature activation+2.31e+00

Pos logit contributions
ust+0.347
irk+0.293
oker+0.275
 forward+0.265
 mind+0.259
Neg logit contributions
lf-0.413
 airplane-0.355
 parked-0.351
 then-0.346
 set-0.341
Token,
Token ID11
Feature activation+1.33e+00

Pos logit contributions
lf+0.044
 then+0.035
 airplane+0.035
 walls+0.033
 set+0.033
Neg logit contributions
ust-0.092
irk-0.080
oker-0.077
 possible-0.075
olding-0.074
Token Jack
Token ID3619
Feature activation+4.66e-01

Pos logit contributions
lf+0.034
 airplane+0.027
 then+0.027
 parked+0.027
 cardboard+0.027
Neg logit contributions
ust-0.045
irk-0.038
oker-0.036
aring-0.035
 forward-0.035
Token put
Token ID1234
Feature activation+2.25e+00

Pos logit contributions
lf+0.010
 Year+0.008
 cardboard+0.008
 airplane+0.008
 fridge+0.007
Neg logit contributions
ust-0.017
irk-0.014
 forward-0.014
 mind-0.014
aring-0.013
Token the
Token ID262
Feature activation+2.03e+00

Pos logit contributions
lf+0.058
 airplane+0.047
 parked+0.046
 Year+0.046
 cardboard+0.045
Neg logit contributions
ust-0.075
irk-0.063
oker-0.060
 forward-0.059
 mind-0.059
Tokenge
Token ID469
Feature activation+5.93e-01

Pos logit contributions
ust+0.104
irk+0.087
oker+0.083
 forward+0.077
 mind+0.077
Neg logit contributions
lf-0.136
 parked-0.118
 airplane-0.117
 cardboard-0.115
 then-0.113
Token.
Token ID13
Feature activation-1.45e+00

Pos logit contributions
lf+0.014
 airplane+0.012
 cardboard+0.012
 then+0.011
 walls+0.011
Neg logit contributions
ust-0.020
irk-0.017
oker-0.017
rys-0.016
ashes-0.016
Token\n
Token ID198
Feature activation-2.02e+00

Pos logit contributions
ust+0.040
irk+0.033
 mind+0.031
 forward+0.031
oker+0.030
Neg logit contributions
lf-0.045
 airplane-0.039
 parked-0.038
 Year-0.037
 cardboard-0.036
Token\n
Token ID198
Feature activation-2.05e+00

Pos logit contributions
ust+0.059
irk+0.049
oker+0.046
 forward+0.045
 mind+0.045
Neg logit contributions
lf-0.058
 airplane-0.051
 Year-0.049
 parked-0.049
 cardboard-0.048
TokenFr
Token ID6732
Feature activation-5.06e+00

Pos logit contributions
ust+0.074
umpy+0.058
irk+0.058
aring+0.058
umble+0.058
Neg logit contributions
lf-0.046
 airplane-0.043
 Year-0.039
 cardboard-0.039
 parked-0.038
Tokenust
Token ID436
Feature activation-1.32e+01

Pos logit contributions
ust+0.018
irk-0.000
 forward-0.005
oker-0.006
 mind-0.007
Neg logit contributions
lf-0.270
 airplane-0.243
 parked-0.241
 cardboard-0.239
 Year-0.238
Tokenrated
Token ID4111
Feature activation+2.41e+00

Pos logit contributions
ust+0.331
irk+0.262
oker+0.247
 forward+0.240
 mind+0.238
Neg logit contributions
lf-0.444
 airplane-0.381
 parked-0.377
 Year-0.374
 cardboard-0.372
Token,
Token ID11
Feature activation+1.30e+00

Pos logit contributions
lf+0.044
 then+0.035
 airplane+0.034
 walls+0.033
 cardboard+0.033
Neg logit contributions
ust-0.098
irk-0.084
oker-0.083
 possible-0.081
olding-0.079
Token the
Token ID262
Feature activation+1.99e+00

Pos logit contributions
lf+0.035
 airplane+0.029
 cardboard+0.028
 parked+0.028
 Year+0.028
Neg logit contributions
ust-0.042
irk-0.035
oker-0.033
 forward-0.032
 possible-0.032
Token little
Token ID1310
Feature activation+3.82e+00

Pos logit contributions
lf+0.067
 Year+0.063
 parked+0.058
 airplane+0.057
ons+0.056
Neg logit contributions
ust-0.048
 mind-0.039
irk-0.036
 loser-0.036
oker-0.035
Token ant
Token ID1885
Feature activation+1.82e+00

Pos logit contributions
lf+0.113
 Year+0.094
 airplane+0.093
 parked+0.093
 cardboard+0.091
Neg logit contributions
ust-0.111
irk-0.092
oker-0.087
 mind-0.086
 forward-0.085
Token print
Token ID3601
Feature activation+2.12e+00

Pos logit contributions
ust+0.096
irk+0.077
oker+0.075
osa+0.072
umble+0.071
Neg logit contributions
lf-0.073
 airplane-0.069
 parked-0.069
 then-0.068
 set-0.068
Token so
Token ID523
Feature activation-1.33e+00

Pos logit contributions
lf+0.043
 set+0.043
 then+0.042
 walls+0.042
 airplane+0.040
Neg logit contributions
ust-0.081
irk-0.067
oker-0.066
aring-0.065
rys-0.063
Token badly
Token ID11234
Feature activation-3.65e+00

Pos logit contributions
ust+0.043
irk+0.038
oker+0.035
ashes+0.034
aring+0.034
Neg logit contributions
lf-0.031
 then-0.031
 airplane-0.026
 parked-0.025
 walls-0.025
Token.
Token ID13
Feature activation-1.69e+00

Pos logit contributions
ust+0.153
irk+0.134
oker+0.129
rys+0.126
 Deal+0.125
Neg logit contributions
 then-0.061
lf-0.059
 store-0.051
 airplane-0.051
 set-0.050
Token Fr
Token ID1305
Feature activation-3.40e+00

Pos logit contributions
ust+0.055
irk+0.046
oker+0.045
 possible+0.044
 forward+0.042
Neg logit contributions
lf-0.043
 airplane-0.036
 parked-0.036
 Year-0.035
 then-0.035
Tokenust
Token ID436
Feature activation-1.32e+01

Pos logit contributions
ust+0.033
irk+0.018
oker+0.014
 forward+0.013
 mind+0.013
Neg logit contributions
lf-0.164
 airplane-0.147
 parked-0.145
 Year-0.145
 cardboard-0.143
Tokenrated
Token ID4111
Feature activation+2.20e+00

Pos logit contributions
ust+0.354
irk+0.302
oker+0.283
 forward+0.272
 mind+0.268
Neg logit contributions
lf-0.400
 airplane-0.343
 parked-0.343
 then-0.337
 set-0.333
Token tears
Token ID10953
Feature activation-6.38e+00

Pos logit contributions
lf+0.038
 then+0.032
 airplane+0.029
 set+0.029
 walls+0.028
Neg logit contributions
ust-0.092
irk-0.079
oker-0.077
 possible-0.075
ashes-0.073
Token were
Token ID547
Feature activation+9.55e-01

Pos logit contributions
ust+0.230
irk+0.197
oker+0.191
 possible+0.185
 mind+0.180
Neg logit contributions
lf-0.146
 parked-0.118
 airplane-0.117
 then-0.117
 set-0.114
Token well
Token ID880
Feature activation-5.84e+00

Pos logit contributions
lf+0.024
 then+0.022
 parked+0.021
 set+0.020
 airplane+0.020
Neg logit contributions
ust-0.032
oker-0.028
irk-0.027
ashes-0.025
rav-0.025
Tokening
Token ID278
Feature activation-3.16e+00

Pos logit contributions
ust+0.255
irk+0.223
oker+0.218
 Deal+0.209
rav+0.208
Neg logit contributions
lf-0.091
 then-0.083
 set-0.070
 airplane-0.070
 parked-0.068
Token was
Token ID373
Feature activation+1.91e+00

Pos logit contributions
lf+0.085
 Year+0.066
 airplane+0.062
 cardboard+0.058
 fridge+0.058
Neg logit contributions
ust-0.183
irk-0.153
 mind-0.152
 possible-0.150
oker-0.149
Token too
Token ID1165
Feature activation-8.73e+00

Pos logit contributions
lf+0.052
 airplane+0.043
 parked+0.043
 cardboard+0.042
 Year+0.042
Neg logit contributions
ust-0.060
irk-0.050
oker-0.048
 forward-0.046
 mind-0.046
Token strong
Token ID1913
Feature activation+3.82e-01

Pos logit contributions
ust+0.243
irk+0.198
oker+0.189
 forward+0.183
 mind+0.181
Neg logit contributions
lf-0.268
 airplane-0.227
 parked-0.226
 cardboard-0.223
 Year-0.221
Token.
Token ID13
Feature activation-1.48e+00

Pos logit contributions
lf+0.008
 airplane+0.007
 then+0.007
 parked+0.007
 set+0.007
Neg logit contributions
ust-0.014
irk-0.012
oker-0.011
rys-0.011
osa-0.011
Token Fr
Token ID1305
Feature activation-3.36e+00

Pos logit contributions
ust+0.045
irk+0.038
 mind+0.037
oker+0.036
 forward+0.036
Neg logit contributions
lf-0.041
 airplane-0.034
 Year-0.033
 parked-0.033
 cardboard-0.032
Tokenust
Token ID436
Feature activation-1.32e+01

Pos logit contributions
ust+0.035
irk+0.015
oker+0.012
 forward+0.010
 mind+0.009
Neg logit contributions
lf-0.163
 airplane-0.148
 parked-0.146
 Year-0.146
 cardboard-0.146
Tokenrated
Token ID4111
Feature activation+2.29e+00

Pos logit contributions
ust+0.320
irk+0.236
 possible+0.226
oker+0.226
 mind+0.225
Neg logit contributions
lf-0.464
 Year-0.399
 airplane-0.394
 cardboard-0.391
 parked-0.387
Token,
Token ID11
Feature activation+1.18e+00

Pos logit contributions
lf+0.047
 airplane+0.038
 then+0.037
 cardboard+0.036
 parked+0.035
Neg logit contributions
ust-0.088
irk-0.076
oker-0.074
 possible-0.071
olding-0.070
Token he
Token ID339
Feature activation-9.26e-01

Pos logit contributions
lf+0.029
 airplane+0.024
 Year+0.023
 parked+0.023
 cardboard+0.023
Neg logit contributions
ust-0.040
irk-0.033
oker-0.032
 forward-0.032
 mind-0.032
Token started
Token ID2067
Feature activation+2.12e+00

Pos logit contributions
ust+0.031
 mind+0.025
 forward+0.024
irk+0.023
aring+0.023
Neg logit contributions
lf-0.023
 Year-0.020
 fridge-0.019
 cardboard-0.019
 couch-0.018
Token to
Token ID284
Feature activation+1.48e+00

Pos logit contributions
lf+0.059
 airplane+0.051
 parked+0.050
 set+0.049
 then+0.049
Neg logit contributions
ust-0.067
irk-0.054
oker-0.054
 loser-0.049
 possible-0.049
Token he
Token ID339
Feature activation-7.60e-01

Pos logit contributions
lf+0.026
 parked+0.021
 airplane+0.021
 then+0.021
 cardboard+0.020
Neg logit contributions
ust-0.037
irk-0.032
oker-0.030
 forward-0.029
 possible-0.029
Token said
Token ID531
Feature activation+2.68e+00

Pos logit contributions
ust+0.030
irk+0.026
oker+0.025
 forward+0.025
 mind+0.025
Neg logit contributions
lf-0.014
 airplane-0.011
 Year-0.010
 parked-0.010
 cardboard-0.010
Token no
Token ID645
Feature activation-1.61e+00

Pos logit contributions
lf+0.073
 airplane+0.060
 parked+0.059
 then+0.058
 cardboard+0.058
Neg logit contributions
ust-0.085
irk-0.071
oker-0.068
 possible-0.065
 forward-0.065
Token.
Token ID13
Feature activation-1.40e+00

Pos logit contributions
ust+0.053
oker+0.046
irk+0.045
 forward+0.044
rys+0.042
Neg logit contributions
lf-0.040
 then-0.034
 airplane-0.034
 engine-0.033
 parked-0.033
Token Fr
Token ID1305
Feature activation-3.15e+00

Pos logit contributions
ust+0.049
irk+0.041
oker+0.040
 possible+0.039
 forward+0.039
Neg logit contributions
lf-0.033
 parked-0.027
 then-0.027
 Year-0.027
 airplane-0.027
Tokenust
Token ID436
Feature activation-1.30e+01

Pos logit contributions
ust+0.021
irk+0.011
 forward+0.009
 mind+0.009
oker+0.007
Neg logit contributions
lf-0.157
 airplane-0.139
 parked-0.139
 Year-0.136
 then-0.136
Tokenrated
Token ID4111
Feature activation+2.45e+00

Pos logit contributions
ust+0.351
irk+0.305
oker+0.286
 forward+0.273
 mind+0.265
Neg logit contributions
lf-0.386
 airplane-0.331
 then-0.330
 parked-0.330
 set-0.325
Token,
Token ID11
Feature activation+1.40e+00

Pos logit contributions
lf+0.046
 then+0.037
 airplane+0.036
 cardboard+0.034
 set+0.034
Neg logit contributions
ust-0.099
irk-0.086
oker-0.083
 possible-0.081
ashes-0.079
Token Bob
Token ID5811
Feature activation-4.52e-01

Pos logit contributions
lf+0.033
 then+0.030
 parked+0.027
 set+0.026
 lake+0.025
Neg logit contributions
ust-0.051
irk-0.046
oker-0.043
aring-0.042
olding-0.042
Token decided
Token ID3066
Feature activation+7.35e+00

Pos logit contributions
ust+0.017
irk+0.015
oker+0.014
 possible+0.014
 forward+0.014
Neg logit contributions
lf-0.009
 airplane-0.007
 parked-0.007
 Year-0.007
 cardboard-0.007
Token to
Token ID284
Feature activation+1.61e+00

Pos logit contributions
lf+0.224
 airplane+0.189
 parked+0.188
 Year+0.183
 cardboard+0.182
Neg logit contributions
ust-0.209
irk-0.170
oker-0.163
 forward-0.156
 mind-0.155
Token.
Token ID13
Feature activation-2.00e+00

Pos logit contributions
ust+0.061
irk+0.054
oker+0.050
 forward+0.049
rys+0.048
Neg logit contributions
lf-0.040
 airplane-0.032
 then-0.032
 parked-0.032
 set-0.032
Token Can
Token ID1680
Feature activation+5.29e+00

Pos logit contributions
ust+0.062
irk+0.052
oker+0.050
 possible+0.048
 forward+0.047
Neg logit contributions
lf-0.052
 Year-0.046
 parked-0.045
 then-0.045
 cardboard-0.045
Token you
Token ID345
Feature activation-8.89e+00

Pos logit contributions
lf+0.153
 airplane+0.127
 parked+0.126
 Year+0.125
 cardboard+0.124
Neg logit contributions
ust-0.157
irk-0.130
oker-0.124
 forward-0.121
 mind-0.120
Token say
Token ID910
Feature activation+3.23e-01

Pos logit contributions
ust+0.280
irk+0.228
 mind+0.225
oker+0.220
 forward+0.220
Neg logit contributions
lf-0.240
 airplane-0.196
 Year-0.195
 parked-0.190
 cardboard-0.189
Token five
Token ID1936
Feature activation-2.52e+00

Pos logit contributions
lf+0.008
 airplane+0.007
 then+0.007
 parked+0.007
 Year+0.007
Neg logit contributions
ust-0.011
irk-0.009
oker-0.009
umpy-0.008
 possible-0.008
Token kil
Token ID8769
Feature activation-1.29e+01

Pos logit contributions
ust+0.081
irk+0.070
oker+0.065
 forward+0.063
umble+0.062
Neg logit contributions
lf-0.068
 airplane-0.056
 then-0.055
 Year-0.054
 cardboard-0.053
Tokenos
Token ID418
Feature activation-1.11e+00

Pos logit contributions
ust+0.344
irk+0.277
oker+0.266
 mind+0.263
 possible+0.256
Neg logit contributions
lf-0.395
 parked-0.340
 airplane-0.331
 Year-0.327
 then-0.324
Token?"
Token ID1701
Feature activation+4.38e+00

Pos logit contributions
ust+0.045
irk+0.039
oker+0.038
 mind+0.037
 possible+0.037
Neg logit contributions
lf-0.020
 airplane-0.015
 parked-0.015
 Year-0.014
 then-0.014
Token uncle
Token ID7711
Feature activation+1.21e+00

Pos logit contributions
lf+0.103
 then+0.097
 store+0.089
 parked+0.084
 airplane+0.082
Neg logit contributions
ust-0.165
beh-0.129
oker-0.129
aring-0.129
ashing-0.129
Token Tom
Token ID4186
Feature activation+1.08e+00

Pos logit contributions
lf+0.034
 parked+0.029
 then+0.028
 set+0.026
 airplane+0.026
Neg logit contributions
ust-0.039
oker-0.032
irk-0.032
 possible-0.031
rys-0.030
Token said
Token ID531
Feature activation+1.99e+00

Pos logit contributions
lf+0.018
 parked+0.013
 then+0.013
 set+0.012
 airplane+0.012
Neg logit contributions
ust-0.046
oker-0.041
 possible-0.040
irk-0.039
 forward-0.038
Token\n
Token ID198
Feature activation-2.30e+00

Pos logit contributions
ust+0.061
irk+0.051
 forward+0.046
oker+0.046
 possible+0.046
Neg logit contributions
lf-0.070
 airplane-0.063
 Year-0.060
 parked-0.059
 cardboard-0.059
Token"
Token ID1
Feature activation+2.83e+00

Pos logit contributions
ust+0.083
irk+0.070
oker+0.068
 forward+0.067
 mind+0.067
Neg logit contributions
lf-0.052
 airplane-0.041
 parked-0.041
 Year-0.040
 cardboard-0.040
TokenThey
Token ID2990
Feature activation+3.51e-01

Pos logit contributions
lf+0.057
 parked+0.048
 then+0.047
 store+0.046
 Year+0.045
Neg logit contributions
ust-0.102
irk-0.089
 forward-0.088
oker-0.088
 mind-0.086
Token weigh
Token ID10164
Feature activation-1.36e+00

Pos logit contributions
lf+0.010
 airplane+0.009
 parked+0.009
 cardboard+0.009
 Year+0.009
Neg logit contributions
ust-0.010
irk-0.008
oker-0.008
 forward-0.008
 possible-0.007
Token five
Token ID1936
Feature activation-2.53e+00

Pos logit contributions
ust+0.048
irk+0.043
oker+0.041
rys+0.039
ashing+0.038
Neg logit contributions
lf-0.031
 then-0.027
 walls-0.025
 parked-0.025
 airplane-0.025
Token kil
Token ID8769
Feature activation-1.29e+01

Pos logit contributions
ust+0.073
irk+0.062
oker+0.057
 forward+0.055
umble+0.053
Neg logit contributions
lf-0.078
 airplane-0.065
 cardboard-0.064
 walls-0.063
 then-0.063
Tokenos
Token ID418
Feature activation-9.82e-01

Pos logit contributions
ust+0.343
irk+0.276
oker+0.261
 mind+0.261
 possible+0.252
Neg logit contributions
lf-0.400
 parked-0.345
 airplane-0.335
 Year-0.332
 cardboard-0.328
Token.
Token ID13
Feature activation-2.00e+00

Pos logit contributions
ust+0.034
irk+0.030
oker+0.027
 mind+0.027
 possible+0.027
Neg logit contributions
lf-0.024
 airplane-0.020
 parked-0.019
 then-0.019
 set-0.019
Token That
Token ID1320
Feature activation-1.97e+00

Pos logit contributions
ust+0.064
irk+0.054
oker+0.051
 possible+0.050
 forward+0.049
Neg logit contributions
lf-0.049
 Year-0.045
 parked-0.044
 then-0.043
 cardboard-0.043
Token means
Token ID1724
Feature activation+2.72e+00

Pos logit contributions
ust+0.073
irk+0.064
oker+0.061
 forward+0.060
 mind+0.059
Neg logit contributions
lf-0.043
 airplane-0.034
 parked-0.033
 cardboard-0.033
 Year-0.033
Token five
Token ID1936
Feature activation-2.53e+00

Pos logit contributions
lf+0.061
 then+0.052
 cardboard+0.048
 walls+0.048
 airplane+0.047
Neg logit contributions
ust-0.102
irk-0.096
umpy-0.086
ashing-0.085
oker-0.085
Token.
Token ID13
Feature activation-1.64e+00

Pos logit contributions
lf+0.041
 then+0.037
 airplane+0.034
 set+0.033
 walls+0.033
Neg logit contributions
ust-0.073
oker-0.062
 common-0.060
irk-0.060
rys-0.059
Token\n
Token ID198
Feature activation-2.18e+00

Pos logit contributions
ust+0.076
beh+0.067
oker+0.065
 possible+0.064
 unnecessary+0.063
Neg logit contributions
lf-0.022
 then-0.019
 store-0.017
 parked-0.016
 set-0.015
Token\n
Token ID198
Feature activation-2.21e+00

Pos logit contributions
ust+0.077
irk+0.064
oker+0.063
 possible+0.061
 forward+0.061
Neg logit contributions
lf-0.051
 parked-0.040
 then-0.040
 Year-0.039
 cardboard-0.039
Token"
Token ID1
Feature activation+2.89e+00

Pos logit contributions
ust+0.080
irk+0.068
rys+0.068
 fault+0.067
oker+0.066
Neg logit contributions
lf-0.053
 then-0.042
 parked-0.040
 store-0.039
 Year-0.037
TokenFive
Token ID20029
Feature activation-3.59e+00

Pos logit contributions
lf+0.066
 then+0.059
 parked+0.057
 set+0.055
 store+0.055
Neg logit contributions
ust-0.097
irk-0.083
oker-0.080
 fault-0.077
 forward-0.077
Token kil
Token ID8769
Feature activation-1.28e+01

Pos logit contributions
ust+0.112
irk+0.093
oker+0.086
ashing+0.081
 forward+0.081
Neg logit contributions
lf-0.102
 then-0.088
 airplane-0.085
 cardboard-0.085
 set-0.085
Tokenos
Token ID418
Feature activation-1.10e+00

Pos logit contributions
ust+0.346
irk+0.278
oker+0.265
 mind+0.262
 possible+0.259
Neg logit contributions
lf-0.392
 parked-0.338
 airplane-0.329
 Year-0.328
 then-0.321
Token,"
Token ID553
Feature activation+6.54e+00

Pos logit contributions
ust+0.039
irk+0.033
oker+0.031
 possible+0.030
 mind+0.030
Neg logit contributions
lf-0.026
 airplane-0.022
 parked-0.021
 then-0.021
 set-0.021
Token Anna
Token ID11735
Feature activation-2.28e+00

Pos logit contributions
lf+0.188
 then+0.177
 store+0.165
 walls+0.147
 set+0.147
Neg logit contributions
ust-0.227
ashing-0.190
beh-0.189
aring-0.181
ashes-0.174
Token repeated
Token ID5100
Feature activation+2.51e+00

Pos logit contributions
ust+0.104
 possible+0.091
oker+0.091
 unnecessary+0.090
 forward+0.089
Neg logit contributions
lf-0.030
 then-0.025
 set-0.024
 parked-0.021
 airplane-0.019
Token.
Token ID13
Feature activation-1.74e+00

Pos logit contributions
lf+0.058
 then+0.053
 airplane+0.048
 walls+0.048
 set+0.046
Neg logit contributions
ust-0.096
irk-0.080
oker-0.078
umpy-0.077
rys-0.073
Token the
Token ID262
Feature activation+2.01e+00

Pos logit contributions
lf+0.036
 airplane+0.029
 parked+0.028
 cardboard+0.028
 Year+0.028
Neg logit contributions
ust-0.053
irk-0.045
oker-0.043
 forward-0.042
 mind-0.042
Token bald
Token ID28273
Feature activation+1.83e+00

Pos logit contributions
lf+0.052
 Year+0.044
 airplane+0.042
 parked+0.040
 then+0.039
Neg logit contributions
ust-0.064
 mind-0.056
 loser-0.053
irk-0.053
 possible-0.052
Token monkey
Token ID21657
Feature activation+3.36e+00

Pos logit contributions
lf+0.039
 Year+0.032
 airplane+0.031
 parked+0.031
 cardboard+0.030
Neg logit contributions
ust-0.065
irk-0.057
 mind-0.055
oker-0.054
 loser-0.053
Token weighed
Token ID20261
Feature activation+1.35e+00

Pos logit contributions
lf+0.058
 Year+0.045
 airplane+0.043
 cardboard+0.041
 parked+0.041
Neg logit contributions
ust-0.136
irk-0.115
 mind-0.114
oker-0.112
 possible-0.112
Token 4
Token ID604
Feature activation-5.67e-01

Pos logit contributions
lf+0.034
 then+0.029
 parked+0.028
 airplane+0.028
 walls+0.028
Neg logit contributions
ust-0.044
irk-0.039
oker-0.037
rys-0.035
aring-0.034
Token kil
Token ID8769
Feature activation-1.27e+01

Pos logit contributions
ust+0.016
irk+0.013
oker+0.012
 forward+0.012
 possible+0.012
Neg logit contributions
lf-0.018
 airplane-0.015
 parked-0.014
 cardboard-0.014
 Year-0.014
Tokenos
Token ID418
Feature activation-6.64e-01

Pos logit contributions
ust+0.323
irk+0.259
oker+0.244
 mind+0.241
 forward+0.235
Neg logit contributions
lf-0.412
 parked-0.352
 airplane-0.348
 Year-0.341
 cardboard-0.341
Token.
Token ID13
Feature activation-1.42e+00

Pos logit contributions
ust+0.021
irk+0.018
oker+0.017
 mind+0.016
osa+0.016
Neg logit contributions
lf-0.018
 airplane-0.015
 parked-0.015
 then-0.015
 cardboard-0.015
Token The
Token ID383
Feature activation+5.50e+00

Pos logit contributions
ust+0.039
irk+0.032
 mind+0.032
oker+0.031
 forward+0.031
Neg logit contributions
lf-0.043
 airplane-0.036
 Year-0.035
 parked-0.034
 cardboard-0.034
Token monkey
Token ID21657
Feature activation+3.39e+00

Pos logit contributions
lf+0.145
 Year+0.124
 airplane+0.116
 parked+0.113
bed+0.108
Neg logit contributions
ust-0.168
 mind-0.151
irk-0.146
 loser-0.145
oker-0.140
Token was
Token ID373
Feature activation+1.96e+00

Pos logit contributions
lf+0.067
 Year+0.063
 airplane+0.058
 cardboard+0.056
 fridge+0.056
Neg logit contributions
ust-0.119
 mind-0.100
 than-0.094
irk-0.094
aring-0.093
Token,
Token ID11
Feature activation+1.11e+00

Pos logit contributions
lf+0.061
 then+0.049
 airplane+0.049
 set+0.048
 walls+0.046
Neg logit contributions
ust-0.129
irk-0.110
oker-0.107
rys-0.104
 unnecessary-0.102
Token being
Token ID852
Feature activation-7.17e+00

Pos logit contributions
lf+0.028
 then+0.024
 parked+0.024
 airplane+0.024
 set+0.023
Neg logit contributions
ust-0.038
irk-0.032
oker-0.031
umpy-0.029
 possible-0.029
Token surrounded
Token ID11191
Feature activation-7.94e-01

Pos logit contributions
ust+0.252
irk+0.225
oker+0.211
umble+0.206
olding+0.206
Neg logit contributions
lf-0.158
 parked-0.154
 then-0.154
 set-0.152
 airplane-0.144
Token by
Token ID416
Feature activation-5.39e+00

Pos logit contributions
ust+0.025
irk+0.021
oker+0.021
ashes+0.020
 loser+0.020
Neg logit contributions
lf-0.020
 set-0.017
 airplane-0.017
 then-0.017
 parked-0.016
Token the
Token ID262
Feature activation+1.90e+00

Pos logit contributions
ust+0.200
irk+0.174
oker+0.170
 forward+0.160
olding+0.160
Neg logit contributions
lf-0.116
 airplane-0.095
 cardboard-0.093
 walls-0.092
 parked-0.091
Token tranqu
Token ID31882
Feature activation-1.26e+01

Pos logit contributions
lf+0.039
 walls+0.035
 airplane+0.035
 cardboard+0.034
 set+0.034
Neg logit contributions
ust-0.070
irk-0.061
oker-0.059
aring-0.057
umble-0.056
Tokenility
Token ID879
Feature activation-2.95e+00

Pos logit contributions
ust+0.309
irk+0.256
oker+0.243
 mind+0.236
 loser+0.231
Neg logit contributions
lf-0.401
 parked-0.352
 Year-0.343
 airplane-0.337
 then-0.332
Token of
Token ID286
Feature activation-3.90e+00

Pos logit contributions
ust+0.097
irk+0.087
 unnecessary+0.078
oker+0.077
rys+0.076
Neg logit contributions
lf-0.077
 set-0.065
 airplane-0.064
 then-0.064
 walls-0.061
Token nature
Token ID3450
Feature activation-2.59e+00

Pos logit contributions
ust+0.140
irk+0.118
oker+0.115
aring+0.111
olding+0.110
Neg logit contributions
lf-0.092
 airplane-0.074
 cardboard-0.074
 parked-0.073
 then-0.072
Token.
Token ID13
Feature activation-1.83e+00

Pos logit contributions
ust+0.094
irk+0.085
oker+0.078
 loser+0.078
olding+0.075
Neg logit contributions
lf-0.060
 then-0.051
 airplane-0.051
 set-0.050
 walls-0.048
Token
Token ID220
Feature activation-1.64e+00

Pos logit contributions
ust+0.076
oker+0.065
 loser+0.063
irk+0.063
beh+0.063
Neg logit contributions
lf-0.027
 then-0.025
 parked-0.024
 lake-0.022
 fields-0.021
Token,
Token ID11
Feature activation+9.16e-01

Pos logit contributions
ust+0.212
irk+0.193
oker+0.181
rys+0.174
 possible+0.169
Neg logit contributions
lf-0.120
 airplane-0.094
 parked-0.092
 then-0.091
 Year-0.090
Token we
Token ID356
Feature activation-5.63e+00

Pos logit contributions
lf+0.025
 airplane+0.022
 then+0.022
 parked+0.022
 store+0.021
Neg logit contributions
ust-0.028
irk-0.024
umble-0.023
oker-0.023
beh-0.022
Token are
Token ID389
Feature activation-1.44e+00

Pos logit contributions
ust+0.165
irk+0.138
oker+0.129
 possible+0.124
 forward+0.124
Neg logit contributions
lf-0.164
 parked-0.140
 airplane-0.139
 set-0.138
 cardboard-0.136
Token happy
Token ID3772
Feature activation+1.61e+00

Pos logit contributions
ust+0.046
irk+0.042
oker+0.038
umble+0.037
olding+0.036
Neg logit contributions
lf-0.040
 then-0.035
 airplane-0.035
 parked-0.034
 set-0.034
Token as
Token ID355
Feature activation-4.10e+00

Pos logit contributions
lf+0.041
 then+0.034
 airplane+0.034
 Year+0.034
 set+0.032
Neg logit contributions
ust-0.056
irk-0.046
oker-0.044
 forward-0.042
umble-0.042
Token can
Token ID460
Feature activation+6.66e+00

Pos logit contributions
ust+0.139
irk+0.124
umpy+0.109
oker+0.108
aring+0.108
Neg logit contributions
lf-0.101
 cardboard-0.089
 set-0.088
 then-0.087
 airplane-0.084
Token be
Token ID307
Feature activation-4.40e+00

Pos logit contributions
lf+0.162
 set+0.145
 then+0.137
 airplane+0.136
 parked+0.135
Neg logit contributions
ust-0.232
irk-0.201
oker-0.185
 unnecessary-0.178
 forward-0.175
Token."
Token ID526
Feature activation+1.50e+00

Pos logit contributions
ust+0.142
irk+0.133
oker+0.116
umble+0.114
olding+0.112
Neg logit contributions
lf-0.112
 set-0.108
 airplane-0.106
 then-0.100
 parked-0.099
Token
Token ID198
Feature activation-1.97e+00

Pos logit contributions
lf+0.032
 then+0.026
 cardboard+0.026
 Year+0.025
 parked+0.025
Neg logit contributions
ust-0.056
irk-0.045
oker-0.045
 possible-0.045
 forward-0.044
Token
Token ID198
Feature activation-1.99e+00

Pos logit contributions
ust+0.041
irk+0.036
apping+0.032
 mind+0.031
 common+0.030
Neg logit contributions
lf-0.069
 airplane-0.063
 Year-0.060
 parked-0.058
 cardboard-0.057
TokenThen
Token ID6423
Feature activation+4.90e+00

Pos logit contributions
ust+0.052
irk+0.041
oker+0.039
 forward+0.039
 loser+0.038
Neg logit contributions
lf-0.063
 airplane-0.056
 Year-0.054
 fridge-0.053
 cardboard-0.052
Token saw
Token ID2497
Feature activation+6.78e+00

Pos logit contributions
cut+0.018
lf+0.018
 cardboard+0.018
 Year+0.017
bridge+0.017
Neg logit contributions
ust-0.016
 mind-0.013
ashing-0.012
umpy-0.011
osa-0.011
Token a
Token ID257
Feature activation+3.00e+00

Pos logit contributions
lf+0.241
 Year+0.215
cut+0.206
stack+0.188
 blasted+0.188
Neg logit contributions
ust-0.130
 possible-0.105
 much-0.103
 mind-0.103
 good-0.103
Token big
Token ID1263
Feature activation+2.23e+00

Pos logit contributions
lf+0.081
 Year+0.067
 airplane+0.066
 parked+0.066
 cardboard+0.064
Neg logit contributions
ust-0.092
irk-0.077
oker-0.074
 mind-0.073
 forward-0.072
Token dog
Token ID3290
Feature activation+2.28e+00

Pos logit contributions
lf+0.045
 Year+0.034
 parked+0.034
 airplane+0.034
 cardboard+0.032
Neg logit contributions
ust-0.083
irk-0.073
oker-0.070
 mind-0.070
 forward-0.069
Token and
Token ID290
Feature activation+1.48e+00

Pos logit contributions
lf+0.068
 Year+0.060
 airplane+0.057
 walls+0.056
 cardboard+0.055
Neg logit contributions
ust-0.060
irk-0.050
 forward-0.049
aring-0.048
 mind-0.047
Token wanted
Token ID2227
Feature activation+5.00e+00

Pos logit contributions
lf+0.042
 Year+0.038
 airplane+0.033
cut+0.032
 cardboard+0.032
Neg logit contributions
ust-0.041
 mind-0.035
irk-0.033
 possible-0.032
 common-0.032
Token to
Token ID284
Feature activation+1.24e+00

Pos logit contributions
lf+0.127
 then+0.114
 parked+0.114
 airplane+0.111
 set+0.111
Neg logit contributions
ust-0.171
irk-0.142
oker-0.141
apping-0.131
umpy-0.129
Token pet
Token ID4273
Feature activation-1.56e+00

Pos logit contributions
lf+0.031
 parked+0.025
 airplane+0.025
 set+0.025
 cardboard+0.025
Neg logit contributions
ust-0.042
irk-0.035
oker-0.034
 forward-0.033
 possible-0.033
Token it
Token ID340
Feature activation-1.05e+00

Pos logit contributions
ust+0.061
irk+0.053
oker+0.051
 possible+0.050
 mind+0.050
Neg logit contributions
lf-0.030
 airplane-0.023
 parked-0.023
 Year-0.022
 cardboard-0.022
Token,
Token ID11
Feature activation+1.03e+00

Pos logit contributions
ust+0.032
irk+0.028
oker+0.027
rys+0.025
umble+0.025
Neg logit contributions
lf-0.028
 set-0.025
 airplane-0.025
 then-0.025
 parked-0.024
Token but
Token ID475
Feature activation+7.71e-01

Pos logit contributions
lf+0.031
 airplane+0.026
 parked+0.026
 cardboard+0.025
 Year+0.025
Neg logit contributions
ust-0.029
irk-0.024
oker-0.023
 forward-0.022
 possible-0.022
Token opened
Token ID4721
Feature activation+5.28e+00

Pos logit contributions
ust+0.051
 mind+0.044
irk+0.042
 forward+0.042
olding+0.039
Neg logit contributions
lf-0.036
 Year-0.029
 airplane-0.027
cut-0.026
 cardboard-0.026
Token the
Token ID262
Feature activation+1.69e+00

Pos logit contributions
lf+0.140
 parked+0.111
 Year+0.110
 airplane+0.109
cut+0.106
Neg logit contributions
ust-0.164
irk-0.138
 mind-0.135
 forward-0.135
 possible-0.134
Token box
Token ID3091
Feature activation+2.11e+00

Pos logit contributions
lf+0.035
 Year+0.027
 parked+0.025
cut+0.025
 then+0.025
Neg logit contributions
ust-0.062
 mind-0.056
 forward-0.055
irk-0.055
 attention-0.052
Token.
Token ID13
Feature activation-1.52e+00

Pos logit contributions
lf+0.054
 Year+0.044
 airplane+0.043
 parked+0.042
 cardboard+0.041
Neg logit contributions
ust-0.069
irk-0.059
 forward-0.056
oker-0.056
 mind-0.055
Token They
Token ID1119
Feature activation+8.63e-01

Pos logit contributions
ust+0.040
irk+0.033
 mind+0.032
ashes+0.031
oker+0.031
Neg logit contributions
lf-0.049
 airplane-0.041
 Year-0.039
cut-0.038
 parked-0.037
Token found
Token ID1043
Feature activation+4.70e+00

Pos logit contributions
lf+0.023
 Year+0.022
cut+0.020
ulum+0.019
 airplane+0.019
Neg logit contributions
ust-0.024
 mind-0.022
bs-0.020
 forward-0.019
irk-0.019
Token many
Token ID867
Feature activation-2.78e+00

Pos logit contributions
lf+0.134
 Year+0.108
 airplane+0.102
 parked+0.101
cut+0.095
Neg logit contributions
ust-0.140
irk-0.120
 forward-0.118
 mind-0.116
 possible-0.116
Token toys
Token ID14958
Feature activation-1.09e+00

Pos logit contributions
ust+0.077
irk+0.063
 mind+0.062
 forward+0.060
oker+0.059
Neg logit contributions
lf-0.080
 Year-0.070
 parked-0.068
 airplane-0.068
cut-0.066
Token inside
Token ID2641
Feature activation-1.90e-02

Pos logit contributions
ust+0.037
irk+0.031
oker+0.030
 forward+0.030
 mind+0.029
Neg logit contributions
lf-0.026
 airplane-0.021
 parked-0.021
 cardboard-0.021
 Year-0.021
Token.
Token ID13
Feature activation-1.55e+00

Pos logit contributions
ust+0.001
irk+0.000
oker+0.000
 forward+0.000
 mind+0.000
Neg logit contributions
lf-0.001
 airplane-0.000
 parked-0.000
 cardboard-0.000
 then-0.000
Token There
Token ID1318
Feature activation+4.18e+00

Pos logit contributions
ust+0.040
irk+0.033
 mind+0.031
oker+0.031
 forward+0.030
Neg logit contributions
lf-0.051
 airplane-0.043
 Year-0.042
 parked-0.041
 cardboard-0.041
Token exploring
Token ID13504
Feature activation-6.10e-01

Pos logit contributions
ust+0.250
irk+0.208
oker+0.199
 forward+0.194
 mind+0.194
Neg logit contributions
lf-0.215
 airplane-0.177
 parked-0.174
 Year-0.173
 cardboard-0.172
Token,
Token ID11
Feature activation+1.10e+00

Pos logit contributions
ust+0.022
irk+0.019
oker+0.018
 forward+0.018
 possible+0.017
Neg logit contributions
lf-0.014
 airplane-0.011
 cardboard-0.011
 parked-0.011
 then-0.011
Token when
Token ID618
Feature activation+5.61e-01

Pos logit contributions
lf+0.033
 Year+0.028
 airplane+0.028
 cardboard+0.027
ons+0.026
Neg logit contributions
ust-0.031
 forward-0.025
 mind-0.025
irk-0.024
 possible-0.024
Token he
Token ID339
Feature activation-1.04e+00

Pos logit contributions
lf+0.015
 Year+0.013
 airplane+0.013
 parked+0.012
cut+0.012
Neg logit contributions
ust-0.016
 possible-0.014
 forward-0.014
 mind-0.013
irk-0.013
Token suddenly
Token ID6451
Feature activation+5.94e+00

Pos logit contributions
ust+0.032
 forward+0.026
irk+0.026
 mind+0.026
oker+0.024
Neg logit contributions
lf-0.027
cut-0.025
 Year-0.025
 airplane-0.024
 cardboard-0.023
Token heard
Token ID2982
Feature activation+5.60e+00

Pos logit contributions
lf+0.121
 parked+0.094
 airplane+0.093
 cardboard+0.090
 set+0.088
Neg logit contributions
ust-0.231
irk-0.198
oker-0.192
 forward-0.187
 possible-0.186
Token a
Token ID257
Feature activation+3.15e+00

Pos logit contributions
lf+0.146
 Year+0.120
 airplane+0.116
 parked+0.115
cut+0.112
Neg logit contributions
ust-0.174
irk-0.146
 mind-0.141
 possible-0.138
 forward-0.138
Token loud
Token ID7812
Feature activation+1.86e+00

Pos logit contributions
lf+0.098
 parked+0.084
 Year+0.084
 airplane+0.083
 cardboard+0.081
Neg logit contributions
ust-0.084
irk-0.068
 mind-0.063
oker-0.063
 forward-0.063
Token noise
Token ID7838
Feature activation-3.02e+00

Pos logit contributions
lf+0.048
 airplane+0.039
 parked+0.039
 Year+0.039
 cardboard+0.038
Neg logit contributions
ust-0.060
irk-0.050
oker-0.048
 mind-0.048
 forward-0.047
Token.
Token ID13
Feature activation-1.43e+00

Pos logit contributions
ust+0.085
irk+0.069
oker+0.066
 forward+0.064
 mind+0.064
Neg logit contributions
lf-0.092
 airplane-0.078
 parked-0.077
 Year-0.076
 cardboard-0.076
Token It
Token ID632
Feature activation+4.71e+00

Pos logit contributions
ust+0.043
irk+0.036
 mind+0.034
oker+0.034
 forward+0.034
Neg logit contributions
lf-0.040
 airplane-0.034
 Year-0.033
 cardboard-0.032
 parked-0.032
Token day
Token ID1110
Feature activation+1.92e+00

Pos logit contributions
ust+0.246
irk+0.210
umpy+0.195
 forward+0.195
oker+0.194
Neg logit contributions
lf-0.165
 airplane-0.137
 cardboard-0.131
 set-0.130
 then-0.125
Token ever
Token ID1683
Feature activation-7.30e+00

Pos logit contributions
lf+0.059
 then+0.054
 airplane+0.051
 set+0.051
ons+0.051
Neg logit contributions
ust-0.055
irk-0.048
 Deal-0.047
umpy-0.041
 tant-0.040
Token!
Token ID0
Feature activation-1.05e+00

Pos logit contributions
ust+0.247
irk+0.218
oker+0.207
 loser+0.200
 mind+0.199
Neg logit contributions
lf-0.180
 airplane-0.149
 then-0.145
 parked-0.143
 set-0.142
Token<|endoftext|>
Token ID50256
Feature activation+4.34e+00

Pos logit contributions
ust+0.037
irk+0.031
 possible+0.030
oker+0.030
 forward+0.029
Neg logit contributions
lf-0.024
 parked-0.019
 cardboard-0.019
 airplane-0.019
 then-0.019
TokenOnce
Token ID7454
Feature activation-2.76e+00

Pos logit contributions
lf+0.121
 airplane+0.102
 Year+0.099
 parked+0.098
 cardboard+0.098
Neg logit contributions
ust-0.131
irk-0.110
oker-0.104
 forward-0.103
 mind-0.102
Token upon
Token ID2402
Feature activation+5.65e+00

Pos logit contributions
ust+0.077
irk+0.063
oker+0.061
 loser+0.058
beh+0.058
Neg logit contributions
lf-0.084
 airplane-0.069
 cardboard-0.068
 parked-0.068
 then-0.067
Token a
Token ID257
Feature activation+3.42e+00

Pos logit contributions
lf+0.179
cut+0.159
 Year+0.159
 airplane+0.156
 mill+0.148
Neg logit contributions
ust-0.150
 mind-0.118
 OK-0.111
irk-0.111
 much-0.108
Token time
Token ID640
Feature activation-3.08e+00

Pos logit contributions
lf+0.116
 cardboard+0.096
 Year+0.096
 then+0.096
 set+0.096
Neg logit contributions
ust-0.080
irk-0.065
 possible-0.064
oker-0.063
 forward-0.062
Token,
Token ID11
Feature activation+1.10e+00

Pos logit contributions
ust+0.117
irk+0.103
oker+0.098
 possible+0.096
 forward+0.096
Neg logit contributions
lf-0.064
 cardboard-0.047
 parked-0.046
 airplane-0.046
 then-0.045
Token there
Token ID612
Feature activation+4.02e+00

Pos logit contributions
lf+0.034
 airplane+0.029
 Year+0.029
 walls+0.027
 parked+0.027
Neg logit contributions
ust-0.031
 forward-0.024
 mind-0.024
irk-0.023
oker-0.023
Token was
Token ID373
Feature activation+2.15e+00

Pos logit contributions
lf+0.086
 airplane+0.084
 Year+0.080
cut+0.072
 cardboard+0.070
Neg logit contributions
ust-0.136
 mind-0.116
irk-0.112
 anymore-0.110
oker-0.109
Token,
Token ID11
Feature activation+1.26e+00

Pos logit contributions
ust+0.008
irk+0.006
oker+0.006
aring+0.006
umpy+0.006
Neg logit contributions
lf-0.005
 then-0.004
 airplane-0.004
 walls-0.004
 cardboard-0.004
Token showing
Token ID4478
Feature activation-4.60e+00

Pos logit contributions
lf+0.035
 airplane+0.029
 Year+0.029
 parked+0.028
 cardboard+0.028
Neg logit contributions
ust-0.038
irk-0.031
 forward-0.030
 mind-0.030
oker-0.030
Token off
Token ID572
Feature activation-3.58e+00

Pos logit contributions
ust+0.167
irk+0.143
oker+0.141
umpy+0.134
 loser+0.133
Neg logit contributions
lf-0.101
 airplane-0.087
 then-0.083
 walls-0.082
 set-0.080
Token her
Token ID607
Feature activation-3.87e+00

Pos logit contributions
ust+0.133
irk+0.117
oker+0.116
 loser+0.112
umpy+0.112
Neg logit contributions
lf-0.088
 walls-0.071
 cardboard-0.069
 airplane-0.069
 set-0.066
Token dress
Token ID6576
Feature activation-5.86e-01

Pos logit contributions
ust+0.122
irk+0.102
 mind+0.097
oker+0.097
 forward+0.096
Neg logit contributions
lf-0.104
 parked-0.085
 airplane-0.085
 Year-0.084
 cardboard-0.082
Token.
Token ID13
Feature activation-1.35e+00

Pos logit contributions
ust+0.018
irk+0.015
oker+0.014
 possible+0.013
 loser+0.013
Neg logit contributions
lf-0.017
 airplane-0.014
 parked-0.014
 set-0.014
 walls-0.014
Token Then
Token ID3244
Feature activation+5.01e+00

Pos logit contributions
ust+0.040
irk+0.033
 mind+0.032
 forward+0.032
oker+0.031
Neg logit contributions
lf-0.039
 airplane-0.033
 parked-0.031
 Year-0.031
 cardboard-0.031
Token she
Token ID673
Feature activation-1.58e+00

Pos logit contributions
lf+0.121
 airplane+0.097
 parked+0.095
 Year+0.094
 cardboard+0.094
Neg logit contributions
ust-0.175
irk-0.148
oker-0.142
 forward-0.139
 mind-0.138
Token went
Token ID1816
Feature activation+3.24e-01

Pos logit contributions
ust+0.053
irk+0.044
 mind+0.043
 forward+0.043
oker+0.041
Neg logit contributions
lf-0.036
cut-0.031
 Year-0.031
 cardboard-0.029
 airplane-0.027
Token to
Token ID284
Feature activation+1.47e+00

Pos logit contributions
lf+0.010
 airplane+0.008
 parked+0.008
 cardboard+0.008
 Year+0.008
Neg logit contributions
ust-0.009
irk-0.008
oker-0.007
 possible-0.007
 mind-0.007
Token show
Token ID905
Feature activation-6.70e+00

Pos logit contributions
lf+0.041
 Year+0.034
cut+0.034
 airplane+0.033
 parked+0.033
Neg logit contributions
ust-0.042
 mind-0.037
irk-0.036
 forward-0.035
 possible-0.034
Token weak
Token ID4939
Feature activation+1.47e+00

Pos logit contributions
ust+0.155
irk+0.135
oker+0.126
osa+0.122
aring+0.118
Neg logit contributions
lf-0.150
 parked-0.132
 cardboard-0.129
 set-0.127
 then-0.127
Token.
Token ID13
Feature activation-1.44e+00

Pos logit contributions
lf+0.029
 airplane+0.024
 cardboard+0.023
 then+0.023
 parked+0.023
Neg logit contributions
ust-0.058
irk-0.050
oker-0.049
rys-0.047
olding-0.046
Token Even
Token ID3412
Feature activation-3.25e+00

Pos logit contributions
ust+0.045
irk+0.038
oker+0.036
 mind+0.036
 forward+0.036
Neg logit contributions
lf-0.039
 airplane-0.032
 parked-0.031
 Year-0.031
 cardboard-0.030
Token the
Token ID262
Feature activation+2.03e+00

Pos logit contributions
ust+0.119
irk+0.104
oker+0.099
aring+0.096
olding+0.095
Neg logit contributions
lf-0.075
 airplane-0.061
 parked-0.061
 cardboard-0.060
 then-0.060
Token straw
Token ID14787
Feature activation-3.73e-01

Pos logit contributions
lf+0.045
 airplane+0.034
 Year+0.034
 parked+0.034
 cardboard+0.033
Neg logit contributions
ust-0.073
irk-0.062
 mind-0.061
 forward-0.060
oker-0.060
Token was
Token ID373
Feature activation+1.97e+00

Pos logit contributions
ust+0.014
irk+0.012
oker+0.012
 mind+0.012
 forward+0.012
Neg logit contributions
lf-0.007
 airplane-0.006
 Year-0.005
 parked-0.005
 cardboard-0.005
Token fragile
Token ID21049
Feature activation+1.21e+00

Pos logit contributions
lf+0.053
 parked+0.048
 then+0.046
 airplane+0.045
 cardboard+0.045
Neg logit contributions
ust-0.061
irk-0.055
oker-0.053
aring-0.048
osa-0.048
Token and
Token ID290
Feature activation+1.69e+00

Pos logit contributions
lf+0.028
 airplane+0.023
 cardboard+0.022
 parked+0.022
 then+0.022
Neg logit contributions
ust-0.043
irk-0.037
oker-0.036
 forward-0.034
 possible-0.034
Token the
Token ID262
Feature activation+2.05e+00

Pos logit contributions
lf+0.044
 airplane+0.036
 parked+0.035
 cardboard+0.035
 Year+0.035
Neg logit contributions
ust-0.055
irk-0.046
oker-0.045
 forward-0.044
 mind-0.043
Token stable
Token ID8245
Feature activation+2.28e+00

Pos logit contributions
lf+0.043
 airplane+0.033
 parked+0.032
 Year+0.032
 cardboard+0.031
Neg logit contributions
ust-0.077
irk-0.066
oker-0.064
 mind-0.063
 forward-0.063
Token began
Token ID2540
Feature activation+1.45e+00

Pos logit contributions
lf+0.036
 airplane+0.025
 Year+0.025
 parked+0.024
 cardboard+0.024
Neg logit contributions
ust-0.097
irk-0.085
oker-0.082
 forward-0.081
 mind-0.081
Token,
Token ID11
Feature activation+1.07e+00

Pos logit contributions
lf+0.010
 airplane+0.008
 parked+0.008
 Year+0.008
 cardboard+0.008
Neg logit contributions
ust-0.013
irk-0.011
oker-0.011
 possible-0.011
 mind-0.010
Token Tom
Token ID4186
Feature activation+1.77e+00

Pos logit contributions
lf+0.036
 Year+0.030
 airplane+0.030
 parked+0.030
cut+0.029
Neg logit contributions
ust-0.027
irk-0.021
 mind-0.020
oker-0.020
 possible-0.020
Token,
Token ID11
Feature activation+1.08e+00

Pos logit contributions
lf+0.044
 airplane+0.035
 parked+0.035
 cardboard+0.034
 Year+0.034
Neg logit contributions
ust-0.060
irk-0.051
oker-0.049
 forward-0.048
 possible-0.047
Token I
Token ID314
Feature activation-2.38e+00

Pos logit contributions
lf+0.033
cut+0.028
 Year+0.028
 parked+0.027
rus+0.026
Neg logit contributions
ust-0.028
 mind-0.024
 possible-0.023
oker-0.022
 OK-0.022
Token have
Token ID423
Feature activation+1.15e+00

Pos logit contributions
ust+0.072
irk+0.060
 mind+0.059
oker+0.059
 forward+0.056
Neg logit contributions
lf-0.069
 Year-0.058
 airplane-0.057
cut-0.055
 parked-0.055
Token a
Token ID257
Feature activation+3.21e+00

Pos logit contributions
lf+0.031
 parked+0.024
 Year+0.024
cut+0.024
 airplane+0.024
Neg logit contributions
ust-0.037
irk-0.030
 mind-0.030
 possible-0.029
 forward-0.029
Token new
Token ID649
Feature activation-3.20e+00

Pos logit contributions
lf+0.090
 parked+0.076
 Year+0.075
cut+0.070
 airplane+0.068
Neg logit contributions
ust-0.096
 mind-0.084
irk-0.083
 possible-0.078
 forward-0.078
Token pen
Token ID3112
Feature activation+4.85e+00

Pos logit contributions
ust+0.118
 mind+0.110
irk+0.109
 possible+0.105
 common+0.104
Neg logit contributions
lf-0.063
 parked-0.051
 Year-0.050
cut-0.042
 then-0.041
Token.
Token ID13
Feature activation-1.57e+00

Pos logit contributions
lf+0.142
 airplane+0.120
 parked+0.117
 cardboard+0.116
 Year+0.114
Neg logit contributions
ust-0.144
irk-0.118
oker-0.111
 mind-0.108
 forward-0.108
Token It
Token ID632
Feature activation+4.86e+00

Pos logit contributions
ust+0.042
irk+0.036
 mind+0.035
 forward+0.035
oker+0.034
Neg logit contributions
lf-0.050
 airplane-0.040
 Year-0.040
cut-0.039
 parked-0.038
Token is
Token ID318
Feature activation-1.26e-02

Pos logit contributions
lf+0.120
 airplane+0.094
 Year+0.094
 parked+0.093
 cardboard+0.090
Neg logit contributions
ust-0.165
irk-0.138
oker-0.134
 mind-0.133
 forward-0.132
Tokenily
Token ID813
Feature activation+8.37e-01

Pos logit contributions
ust+0.113
irk+0.082
oker+0.076
olding+0.071
 mind+0.070
Neg logit contributions
lf-0.196
 airplane-0.175
 Year-0.172
 parked-0.172
 cardboard-0.171
Token had
Token ID550
Feature activation+4.10e+00

Pos logit contributions
 Year+0.019
cut+0.019
 airplane+0.019
lf+0.019
 cardboard+0.019
Neg logit contributions
ust-0.023
osa-0.019
 mind-0.019
 persuasive-0.017
a-0.017
Token so
Token ID523
Feature activation-1.31e+00

Pos logit contributions
lf+0.135
cut+0.123
 Year+0.113
rus+0.110
 airplane+0.109
Neg logit contributions
ust-0.099
 mind-0.084
 possible-0.081
 OK-0.076
 forward-0.076
Token much
Token ID881
Feature activation-8.69e+00

Pos logit contributions
ust+0.021
irk+0.014
oker+0.013
 possible+0.012
 forward+0.012
Neg logit contributions
lf-0.055
 airplane-0.049
 parked-0.049
 cardboard-0.049
 Year-0.048
Token fun
Token ID1257
Feature activation-4.43e+00

Pos logit contributions
ust+0.199
irk+0.152
oker+0.142
 forward+0.138
 mind+0.136
Neg logit contributions
lf-0.313
 airplane-0.271
 parked-0.269
 cardboard-0.268
 Year-0.267
Token jumping
Token ID14284
Feature activation-1.35e+00

Pos logit contributions
ust+0.140
irk+0.116
oker+0.115
 forward+0.113
 mind+0.113
Neg logit contributions
lf-0.111
 airplane-0.093
 Year-0.093
 parked-0.093
 cardboard-0.089
Token and
Token ID290
Feature activation+1.69e+00

Pos logit contributions
ust+0.042
irk+0.034
oker+0.033
 mind+0.032
 forward+0.032
Neg logit contributions
lf-0.038
 airplane-0.031
 parked-0.031
 Year-0.031
 cardboard-0.030
Token rolling
Token ID10708
Feature activation-1.00e+00

Pos logit contributions
lf+0.048
 Year+0.044
 airplane+0.043
 cardboard+0.041
 engine+0.040
Neg logit contributions
ust-0.044
 mind-0.037
 possible-0.037
 forward-0.037
irk-0.035
Token around
Token ID1088
Feature activation-2.59e-01

Pos logit contributions
ust+0.036
irk+0.030
oker+0.029
olding+0.028
aring+0.028
Neg logit contributions
lf-0.025
 then-0.020
 airplane-0.020
 walls-0.020
 cardboard-0.020
Token on
Token ID319
Feature activation-3.57e+00

Pos logit contributions
ust+0.008
irk+0.006
oker+0.006
 forward+0.006
 mind+0.006
Neg logit contributions
lf-0.007
 airplane-0.006
 parked-0.006
 Year-0.006
 cardboard-0.006
Token the
Token ID262
Feature activation+2.01e+00

Pos logit contributions
ust+0.108
 mind+0.091
irk+0.090
 forward+0.089
 possible+0.087
Neg logit contributions
lf-0.092
 parked-0.078
 Year-0.075
 airplane-0.074
cut-0.073
TokenOne
Token ID3198
Feature activation-1.91e+00

Pos logit contributions
ust+0.050
irk+0.036
oker+0.036
 forward+0.036
 possible+0.035
Neg logit contributions
lf-0.049
 airplane-0.044
 Year-0.043
 cardboard-0.041
 parked-0.041
Token day
Token ID1110
Feature activation+1.97e+00

Pos logit contributions
ust+0.081
irk+0.070
 unnecessary+0.068
beh+0.066
 fault+0.066
Neg logit contributions
lf-0.035
 set-0.027
 then-0.027
 parked-0.026
 store-0.025
Token,
Token ID11
Feature activation+1.28e+00

Pos logit contributions
lf+0.037
 cardboard+0.027
 then+0.027
 set+0.026
 airplane+0.025
Neg logit contributions
ust-0.084
irk-0.074
oker-0.068
 possible-0.068
umpy-0.068
Token Lily
Token ID20037
Feature activation+2.15e-01

Pos logit contributions
lf+0.039
 airplane+0.033
 parked+0.033
 cardboard+0.033
 then+0.032
Neg logit contributions
ust-0.036
irk-0.030
oker-0.028
 possible-0.027
 forward-0.027
Token's
Token ID338
Feature activation-2.14e+00

Pos logit contributions
lf+0.005
cut+0.004
 cardboard+0.004
 Year+0.004
 airplane+0.004
Neg logit contributions
ust-0.007
irk-0.006
 mind-0.006
oker-0.006
 forward-0.006
Token dad
Token ID9955
Feature activation+4.77e-01

Pos logit contributions
ust+0.066
irk+0.055
oker+0.053
 forward+0.051
 possible+0.051
Neg logit contributions
lf-0.059
 airplane-0.049
 parked-0.048
 Year-0.048
 cardboard-0.048
Token brought
Token ID3181
Feature activation-1.04e+00

Pos logit contributions
lf+0.011
 Year+0.009
 cardboard+0.009
 airplane+0.009
 walls+0.008
Neg logit contributions
ust-0.016
irk-0.014
oker-0.013
 forward-0.013
 mind-0.013
Token home
Token ID1363
Feature activation-1.21e+00

Pos logit contributions
ust+0.040
oker+0.034
aring+0.033
irk+0.033
umpy+0.033
Neg logit contributions
lf-0.020
 set-0.020
 airplane-0.019
 cardboard-0.019
 then-0.018
Token a
Token ID257
Feature activation+3.36e+00

Pos logit contributions
ust+0.054
irk+0.048
rys+0.047
oker+0.046
aring+0.045
Neg logit contributions
lf-0.019
 set-0.016
 then-0.015
 parked-0.014
 cardboard-0.014
Token surprise
Token ID5975
Feature activation-1.48e+00

Pos logit contributions
lf+0.096
 airplane+0.081
 cardboard+0.079
 parked+0.079
 Year+0.078
Neg logit contributions
ust-0.101
irk-0.083
oker-0.079
 forward-0.077
 possible-0.077
Token for
Token ID329
Feature activation-4.46e+00

Pos logit contributions
ust+0.041
irk+0.032
oker+0.030
umpy+0.030
umble+0.029
Neg logit contributions
lf-0.049
 airplane-0.042
 parked-0.041
 set-0.040
 cardboard-0.040
Token explore
Token ID7301
Feature activation+7.32e-01

Pos logit contributions
lf+0.032
 Year+0.027
 airplane+0.026
 cardboard+0.025
 parked+0.025
Neg logit contributions
ust-0.031
 mind-0.026
 forward-0.025
 possible-0.025
oker-0.024
Token the
Token ID262
Feature activation+1.74e+00

Pos logit contributions
lf+0.019
 airplane+0.015
 parked+0.015
 Year+0.015
 cardboard+0.015
Neg logit contributions
ust-0.024
irk-0.020
oker-0.019
 forward-0.019
 mind-0.019
Token world
Token ID995
Feature activation-1.78e+00

Pos logit contributions
lf+0.042
 parked+0.033
 airplane+0.033
 Year+0.033
 cardboard+0.031
Neg logit contributions
ust-0.061
irk-0.050
 forward-0.049
 mind-0.049
oker-0.049
Token around
Token ID1088
Feature activation-5.68e-01

Pos logit contributions
ust+0.051
irk+0.041
oker+0.039
 mind+0.037
 loser+0.037
Neg logit contributions
lf-0.055
 airplane-0.046
 parked-0.046
 cardboard-0.045
 walls-0.045
Token him
Token ID683
Feature activation-2.86e+00

Pos logit contributions
ust+0.024
irk+0.021
oker+0.020
aring+0.020
olding+0.020
Neg logit contributions
lf-0.010
 set-0.007
 then-0.007
 walls-0.007
 airplane-0.007
Token.
Token ID13
Feature activation-1.72e+00

Pos logit contributions
ust+0.075
irk+0.060
oker+0.057
 forward+0.056
 mind+0.055
Neg logit contributions
lf-0.093
 airplane-0.079
 parked-0.078
 Year-0.078
 cardboard-0.077
Token So
Token ID1406
Feature activation+1.73e+00

Pos logit contributions
ust+0.055
irk+0.046
oker+0.044
 forward+0.043
 mind+0.043
Neg logit contributions
lf-0.046
 airplane-0.038
 parked-0.037
 Year-0.037
 cardboard-0.037
Token he
Token ID339
Feature activation-1.20e+00

Pos logit contributions
lf+0.046
 Year+0.039
 airplane+0.038
 parked+0.038
 cardboard+0.036
Neg logit contributions
ust-0.053
 mind-0.044
 forward-0.044
irk-0.043
 possible-0.043
Token decided
Token ID3066
Feature activation+7.12e+00

Pos logit contributions
ust+0.043
 mind+0.035
 forward+0.034
irk+0.033
irm+0.032
Neg logit contributions
lf-0.027
 Year-0.022
 cardboard-0.021
 airplane-0.021
 fridge-0.021
Token to
Token ID284
Feature activation+1.10e+00

Pos logit contributions
lf+0.261
 Year+0.228
 cardboard+0.222
 airplane+0.221
 parked+0.216
Neg logit contributions
ust-0.154
 possible-0.114
 forward-0.112
irk-0.111
 mind-0.110
Token march
Token ID9960
Feature activation-9.59e-01

Pos logit contributions
lf+0.031
 Year+0.025
 airplane+0.025
 parked+0.025
 cardboard+0.024
Neg logit contributions
ust-0.033
 mind-0.027
irk-0.026
oker-0.026
 possible-0.025
Token
Token ID198
Feature activation-2.06e+00

Pos logit contributions
ust+0.042
irk+0.038
 mind+0.032
olding+0.030
umpy+0.030
Neg logit contributions
lf-0.073
 airplane-0.069
 Year-0.065
 parked-0.064
 engine-0.063
TokenL
Token ID43
Feature activation-5.46e+00

Pos logit contributions
ust+0.051
irk+0.040
oker+0.036
olding+0.036
 mind+0.036
Neg logit contributions
lf-0.070
 airplane-0.064
 Year-0.062
 parked-0.061
 cardboard-0.060
Tokenily
Token ID813
Feature activation+8.45e-01

Pos logit contributions
ust+0.122
irk+0.093
oker+0.086
 mind+0.081
 forward+0.081
Neg logit contributions
lf-0.201
 airplane-0.178
 Year-0.175
 parked-0.174
 cardboard-0.173
Token looked
Token ID3114
Feature activation+8.11e-01

Pos logit contributions
lf+0.023
 airplane+0.020
 cardboard+0.020
 Year+0.019
cut+0.018
Neg logit contributions
ust-0.026
irk-0.021
 mind-0.021
olding-0.019
aring-0.019
Token at
Token ID379
Feature activation-3.62e+00

Pos logit contributions
 then+0.023
lf+0.022
 airplane+0.022
 set+0.021
 cardboard+0.021
Neg logit contributions
ust-0.023
oker-0.020
 possible-0.019
 loser-0.019
umble-0.018
Token her
Token ID607
Feature activation-3.99e+00

Pos logit contributions
ust+0.084
irk+0.070
 mind+0.068
 forward+0.065
 possible+0.063
Neg logit contributions
lf-0.123
 Year-0.108
 parked-0.104
 airplane-0.103
 cardboard-0.100
Token mom
Token ID1995
Feature activation+7.26e-02

Pos logit contributions
ust+0.143
irk+0.122
oker+0.117
 mind+0.116
 forward+0.116
Neg logit contributions
lf-0.091
 airplane-0.072
 parked-0.071
 Year-0.071
 cardboard-0.069
Token and
Token ID290
Feature activation+1.35e+00

Pos logit contributions
lf+0.002
 airplane+0.001
 parked+0.001
 Year+0.001
 cardboard+0.001
Neg logit contributions
ust-0.002
irk-0.002
oker-0.002
 possible-0.002
 forward-0.002
Token then
Token ID788
Feature activation+5.72e+00

Pos logit contributions
lf+0.027
 airplane+0.020
 parked+0.020
 then+0.020
 cardboard+0.019
Neg logit contributions
ust-0.054
irk-0.046
oker-0.045
 possible-0.044
 forward-0.044
Token at
Token ID379
Feature activation-3.61e+00

Pos logit contributions
 then+0.132
 set+0.106
lf+0.102
 showed+0.101
 parked+0.100
Neg logit contributions
ust-0.225
 loser-0.189
oker-0.186
umpy-0.186
umble-0.177
Token her
Token ID607
Feature activation-3.97e+00

Pos logit contributions
ust+0.112
irk+0.096
 mind+0.092
 forward+0.091
oker+0.091
Neg logit contributions
lf-0.098
 airplane-0.079
 Year-0.079
 parked-0.078
 cardboard-0.077
Tokenâ
Token ID22940
Feature activation-1.04e+00

Pos logit contributions
lf+0.192
 airplane+0.149
 parked+0.146
 Year+0.144
 cardboard+0.142
Neg logit contributions
ust-0.345
irk-0.297
oker-0.287
 forward-0.281
 mind-0.281
Token
Token ID26391
Feature activation+1.30e+00

Pos logit contributions
ust+0.030
irk+0.024
oker+0.023
 forward+0.022
 possible+0.022
Neg logit contributions
lf-0.031
 parked-0.026
 airplane-0.026
 Year-0.026
 cardboard-0.025
Token
Token ID8151
Feature activation-2.45e+00

Pos logit contributions
lf+0.039
 airplane+0.033
 cardboard+0.032
 Year+0.032
 parked+0.032
Neg logit contributions
ust-0.037
irk-0.030
oker-0.029
 mind-0.028
 forward-0.028
Tokent
Token ID83
Feature activation-3.20e+00

Pos logit contributions
ust+0.065
rys+0.059
irk+0.058
 unnecessary+0.058
 common+0.055
Neg logit contributions
lf-0.068
 parked-0.063
 then-0.058
on-0.058
ons-0.056
Token decide
Token ID5409
Feature activation+1.83e+00

Pos logit contributions
ust+0.087
irk+0.073
oker+0.070
osa+0.067
elly+0.067
Neg logit contributions
lf-0.098
 then-0.088
 set-0.087
 parked-0.086
 airplane-0.086
Token where
Token ID810
Feature activation-2.52e+00

Pos logit contributions
lf+0.045
 airplane+0.036
 parked+0.035
 Year+0.035
 cardboard+0.035
Neg logit contributions
ust-0.062
irk-0.053
oker-0.051
 forward-0.050
 mind-0.050
Token to
Token ID284
Feature activation+1.22e+00

Pos logit contributions
ust+0.076
irk+0.063
oker+0.061
 mind+0.061
 possible+0.060
Neg logit contributions
lf-0.069
 Year-0.057
 airplane-0.057
 parked-0.057
 cardboard-0.055
Token go
Token ID467
Feature activation-3.00e+00

Pos logit contributions
lf+0.034
 Year+0.029
 airplane+0.028
 cardboard+0.027
 parked+0.027
Neg logit contributions
ust-0.036
 mind-0.030
irk-0.028
oker-0.028
 possible-0.028
Token so
Token ID523
Feature activation-1.21e+00

Pos logit contributions
ust+0.103
irk+0.086
oker+0.083
 loser+0.079
osa+0.078
Neg logit contributions
lf-0.074
 airplane-0.063
 parked-0.063
 set-0.061
 cardboard-0.061
Token he
Token ID339
Feature activation-1.06e+00

Pos logit contributions
ust+0.044
irk+0.039
oker+0.037
aring+0.036
ashes+0.035
Neg logit contributions
lf-0.026
 then-0.022
 parked-0.021
 airplane-0.021
 cardboard-0.020
Token asked
Token ID1965
Feature activation+2.98e+00

Pos logit contributions
ust+0.041
irk+0.035
 mind+0.034
oker+0.034
 forward+0.034
Neg logit contributions
lf-0.021
 Year-0.016
 airplane-0.016
 cardboard-0.015
 parked-0.015
Token made
Token ID925
Feature activation+6.46e-01

Pos logit contributions
ust+0.090
irk+0.073
oker+0.069
 mind+0.069
 forward+0.068
Neg logit contributions
lf-0.095
 airplane-0.079
 Year-0.078
 parked-0.077
 cardboard-0.077
Token a
Token ID257
Feature activation+3.03e+00

Pos logit contributions
lf+0.019
 parked+0.016
cut+0.014
 airplane+0.014
 Year+0.014
Neg logit contributions
ust-0.019
 possible-0.015
 mind-0.015
irk-0.015
 forward-0.015
Token feast
Token ID26951
Feature activation-1.35e+00

Pos logit contributions
lf+0.079
 airplane+0.068
 cardboard+0.066
 Year+0.064
 parked+0.063
Neg logit contributions
ust-0.097
irk-0.082
oker-0.078
 forward-0.076
 possible-0.075
Token for
Token ID329
Feature activation-4.69e+00

Pos logit contributions
ust+0.051
irk+0.045
 unnecessary+0.043
rys+0.043
irm+0.041
Neg logit contributions
lf-0.031
 set-0.024
 then-0.024
 airplane-0.023
 parked-0.022
Token the
Token ID262
Feature activation+1.72e+00

Pos logit contributions
ust+0.159
irk+0.135
oker+0.129
 forward+0.125
umpy+0.122
Neg logit contributions
lf-0.120
 airplane-0.098
 cardboard-0.097
 Year-0.097
 then-0.096
Token animals
Token ID4695
Feature activation-2.41e+00

Pos logit contributions
lf+0.040
 parked+0.030
 Year+0.030
 airplane+0.029
 cardboard+0.028
Neg logit contributions
ust-0.061
irk-0.052
 mind-0.051
 possible-0.050
oker-0.050
Token!"
Token ID2474
Feature activation+3.05e+00

Pos logit contributions
ust+0.093
irk+0.086
rys+0.078
oker+0.077
 unnecessary+0.076
Neg logit contributions
lf-0.047
 airplane-0.041
 then-0.039
 set-0.038
 parked-0.038
Token\n
Token ID198
Feature activation-2.17e+00

Pos logit contributions
lf+0.078
 airplane+0.064
 parked+0.064
 Year+0.063
 cardboard+0.063
Neg logit contributions
ust-0.101
irk-0.083
oker-0.081
 forward-0.079
 possible-0.078
Token\n
Token ID198
Feature activation-2.20e+00

Pos logit contributions
ust+0.059
irk+0.050
oker+0.046
 mind+0.046
 forward+0.045
Neg logit contributions
lf-0.067
 airplane-0.057
 Year-0.055
 parked-0.054
 cardboard-0.054
TokenS
Token ID50
Feature activation-6.31e+00

Pos logit contributions
ust+0.064
irk+0.053
oker+0.051
ashes+0.051
apping+0.051
Neg logit contributions
lf-0.064
 airplane-0.054
 cardboard-0.052
 Year-0.051
 fridge-0.051
Tokenara
Token ID3301
Feature activation-1.63e+00

Pos logit contributions
ust+0.171
irk+0.139
oker+0.132
 forward+0.127
 mind+0.126
Neg logit contributions
lf-0.200
 airplane-0.170
 parked-0.167
 Year-0.165
 cardboard-0.165
Token thin
Token ID7888
Feature activation+1.73e+00

Pos logit contributions
lf+0.184
 Year+0.159
 airplane+0.147
pped+0.147
ons+0.144
Neg logit contributions
ust-0.137
 mind-0.128
irk-0.121
 loser-0.117
umpy-0.115
Token dog
Token ID3290
Feature activation+2.38e+00

Pos logit contributions
lf+0.044
 Year+0.037
 airplane+0.035
 parked+0.035
 cardboard+0.034
Neg logit contributions
ust-0.053
irk-0.046
 mind-0.046
 loser-0.044
oker-0.044
Token looked
Token ID3114
Feature activation+7.97e-01

Pos logit contributions
lf+0.046
 Year+0.035
 airplane+0.035
 cardboard+0.034
 parked+0.033
Neg logit contributions
ust-0.092
irk-0.079
 mind-0.077
oker-0.075
 forward-0.075
Token sad
Token ID6507
Feature activation+1.98e+00

Pos logit contributions
lf+0.021
 then+0.020
 parked+0.020
 airplane+0.020
 set+0.020
Neg logit contributions
ust-0.024
oker-0.020
irk-0.019
 possible-0.019
umble-0.019
Token.
Token ID13
Feature activation-1.51e+00

Pos logit contributions
lf+0.045
 then+0.043
 airplane+0.040
 set+0.039
 cardboard+0.039
Neg logit contributions
ust-0.066
oker-0.059
irk-0.057
rys-0.055
beh-0.053
Token\n
Token ID198
Feature activation-1.94e+00

Pos logit contributions
ust+0.040
irk+0.033
 mind+0.032
oker+0.031
 forward+0.030
Neg logit contributions
lf-0.048
 airplane-0.042
 Year-0.039
 cardboard-0.039
 parked-0.038
Token\n
Token ID198
Feature activation-1.96e+00

Pos logit contributions
ust+0.051
irk+0.043
 mind+0.039
oker+0.039
 forward+0.038
Neg logit contributions
lf-0.060
 airplane-0.055
 Year-0.052
 cardboard-0.051
 parked-0.051
TokenThe
Token ID464
Feature activation+5.14e+00

Pos logit contributions
ust+0.057
umpy+0.047
 mind+0.046
 common+0.045
aring+0.045
Neg logit contributions
 airplane-0.052
lf-0.051
 Year-0.047
 fridge-0.046
 cardboard-0.045
Token fat
Token ID3735
Feature activation+1.31e+00

Pos logit contributions
lf+0.173
 Year+0.152
ons+0.146
pped+0.145
bed+0.143
Neg logit contributions
ust-0.109
 mind-0.106
 loser-0.097
irk-0.092
 good-0.092
Token cat
Token ID3797
Feature activation+4.13e+00

Pos logit contributions
lf+0.034
 Year+0.029
 airplane+0.028
 parked+0.027
 cardboard+0.027
Neg logit contributions
ust-0.039
irk-0.035
 mind-0.034
 loser-0.033
umpy-0.033
Token wanted
Token ID2227
Feature activation+5.06e+00

Pos logit contributions
lf+0.097
 Year+0.084
 airplane+0.078
 cardboard+0.078
cut+0.074
Neg logit contributions
ust-0.136
 mind-0.113
irk-0.111
 forward-0.105
aring-0.104
Token,
Token ID11
Feature activation+8.70e-01

Pos logit contributions
lf+0.036
 then+0.033
 set+0.033
 airplane+0.030
 parked+0.029
Neg logit contributions
ust-0.057
irk-0.052
oker-0.046
umble-0.046
rys-0.045
Token Lily
Token ID20037
Feature activation-2.91e-01

Pos logit contributions
lf+0.026
 airplane+0.021
 parked+0.021
 Year+0.021
 cardboard+0.021
Neg logit contributions
ust-0.025
irk-0.020
oker-0.020
 mind-0.019
 forward-0.019
Token."
Token ID526
Feature activation+1.10e+00

Pos logit contributions
ust+0.009
irk+0.008
oker+0.007
 forward+0.007
 mind+0.007
Neg logit contributions
lf-0.008
 airplane-0.007
 parked-0.007
 Year-0.007
 set-0.006
Token And
Token ID843
Feature activation-1.26e+00

Pos logit contributions
lf+0.027
 airplane+0.022
 Year+0.022
 parked+0.022
 then+0.022
Neg logit contributions
ust-0.037
irk-0.031
oker-0.030
 possible-0.030
 forward-0.029
Token they
Token ID484
Feature activation-3.34e+00

Pos logit contributions
ust+0.045
irk+0.039
oker+0.037
 possible+0.035
umpy+0.035
Neg logit contributions
lf-0.030
 then-0.025
 airplane-0.023
 parked-0.023
 set-0.023
Token all
Token ID477
Feature activation-8.05e+00

Pos logit contributions
ust+0.114
irk+0.094
 forward+0.093
 mind+0.092
oker+0.089
Neg logit contributions
lf-0.080
 Year-0.066
 airplane-0.063
 cardboard-0.063
 walls-0.061
Token went
Token ID1816
Feature activation+1.50e-01

Pos logit contributions
ust+0.321
irk+0.279
oker+0.270
 forward+0.266
 mind+0.265
Neg logit contributions
lf-0.150
 airplane-0.112
 parked-0.109
 Year-0.108
 cardboard-0.106
Token home
Token ID1363
Feature activation-1.58e+00

Pos logit contributions
lf+0.004
 airplane+0.004
 set+0.004
 parked+0.004
 cardboard+0.004
Neg logit contributions
ust-0.005
irk-0.004
oker-0.004
 loser-0.004
umpy-0.004
Token happy
Token ID3772
Feature activation+1.43e+00

Pos logit contributions
ust+0.061
irk+0.051
rys+0.050
oker+0.050
 unnecessary+0.049
Neg logit contributions
lf-0.037
 then-0.031
 set-0.029
 parked-0.029
 airplane-0.028
Token.
Token ID13
Feature activation-1.86e+00

Pos logit contributions
lf+0.034
 airplane+0.027
 then+0.027
 Year+0.026
 parked+0.026
Neg logit contributions
ust-0.052
irk-0.044
oker-0.042
 forward-0.041
 possible-0.041
Token<|endoftext|>
Token ID50256
Feature activation+3.79e+00

Pos logit contributions
ust+0.066
irk+0.056
oker+0.054
 possible+0.053
 forward+0.053
Neg logit contributions
lf-0.043
 parked-0.035
 then-0.034
 Year-0.034
 airplane-0.034
Token mom
Token ID1995
Feature activation-6.27e-01

Pos logit contributions
lf+0.023
 then+0.021
 airplane+0.021
 cardboard+0.019
 store+0.019
Neg logit contributions
ust-0.031
beh-0.026
irk-0.026
ashing-0.025
umble-0.025
Token.
Token ID13
Feature activation-1.55e+00

Pos logit contributions
ust+0.018
irk+0.014
oker+0.014
 forward+0.013
 possible+0.013
Neg logit contributions
lf-0.019
 airplane-0.016
 parked-0.016
 Year-0.016
 cardboard-0.016
Token I
Token ID314
Feature activation-1.97e+00

Pos logit contributions
ust+0.052
irk+0.042
 forward+0.040
 possible+0.040
oker+0.040
Neg logit contributions
lf-0.040
 Year-0.034
 cardboard-0.033
 then-0.033
 airplane-0.033
Token love
Token ID1842
Feature activation-2.85e+00

Pos logit contributions
ust+0.071
irk+0.061
oker+0.060
 possible+0.060
 forward+0.060
Neg logit contributions
lf-0.039
 then-0.034
 parked-0.033
 set-0.033
 airplane-0.031
Token you
Token ID345
Feature activation-8.75e+00

Pos logit contributions
ust+0.104
irk+0.088
oker+0.083
 forward+0.083
umpy+0.081
Neg logit contributions
lf-0.070
 then-0.057
 airplane-0.056
 Year-0.055
 cardboard-0.055
Token too
Token ID1165
Feature activation-9.39e+00

Pos logit contributions
ust+0.284
irk+0.245
oker+0.241
rys+0.228
 possible+0.226
Neg logit contributions
lf-0.221
 then-0.196
 airplane-0.187
 set-0.184
 parked-0.183
Token.
Token ID13
Feature activation-1.78e+00

Pos logit contributions
ust+0.332
irk+0.305
oker+0.300
rys+0.291
beh+0.274
Neg logit contributions
lf-0.192
 then-0.171
 walls-0.158
 cardboard-0.153
 fridge-0.152
Token You
Token ID921
Feature activation-3.96e+00

Pos logit contributions
ust+0.060
irk+0.048
 possible+0.046
oker+0.046
 forward+0.046
Neg logit contributions
lf-0.044
 Year-0.038
 then-0.038
 cardboard-0.037
 parked-0.037
Token are
Token ID389
Feature activation-1.41e+00

Pos logit contributions
ust+0.113
irk+0.092
oker+0.088
 forward+0.086
 mind+0.086
Neg logit contributions
lf-0.120
 airplane-0.100
 parked-0.099
 Year-0.099
 cardboard-0.098
Token right
Token ID826
Feature activation-7.46e+00

Pos logit contributions
ust+0.052
irk+0.051
oker+0.049
ashes+0.048
umble+0.047
Neg logit contributions
lf-0.029
 then-0.027
 parked-0.027
 cardboard-0.025
 airplane-0.023
Token.
Token ID13
Feature activation-1.78e+00

Pos logit contributions
ust+0.215
irk+0.175
oker+0.169
 possible+0.160
 forward+0.160
Neg logit contributions
lf-0.224
 airplane-0.190
 parked-0.187
 Year-0.186
 cardboard-0.184
Token It
Token ID632
Feature activation+4.52e+00

Pos logit contributions
ust+0.035
irk+0.029
oker+0.028
 possible+0.027
 forward+0.027
Neg logit contributions
lf-0.030
 airplane-0.025
 parked-0.024
 Year-0.024
 cardboard-0.024
Token was
Token ID373
Feature activation+1.90e+00

Pos logit contributions
lf+0.062
 airplane+0.040
 parked+0.038
 Year+0.038
 cardboard+0.037
Neg logit contributions
ust-0.203
irk-0.179
oker-0.174
 forward-0.172
 mind-0.172
Token filled
Token ID5901
Feature activation+2.72e+00

Pos logit contributions
lf+0.048
 Year+0.039
 airplane+0.038
 parked+0.037
 cardboard+0.036
Neg logit contributions
ust-0.062
irk-0.051
 possible-0.051
 forward-0.050
 mind-0.050
Token with
Token ID351
Feature activation-8.30e+00

Pos logit contributions
lf+0.075
 airplane+0.060
 parked+0.060
 Year+0.059
 cardboard+0.059
Neg logit contributions
ust-0.086
irk-0.073
oker-0.069
 mind-0.067
 forward-0.066
Token ch
Token ID442
Feature activation-6.45e+00

Pos logit contributions
ust+0.281
irk+0.236
oker+0.229
 forward+0.220
 possible+0.219
Neg logit contributions
lf-0.205
 airplane-0.169
 cardboard-0.167
 parked-0.165
 Year-0.162
Tokenocol
Token ID4668
Feature activation-1.09e+01

Pos logit contributions
ust+0.178
 possible+0.139
oker+0.129
irk+0.128
 forward+0.124
Neg logit contributions
lf-0.194
 cardboard-0.179
 walls-0.174
 airplane-0.170
ons-0.169
Tokenates
Token ID689
Feature activation-1.42e+00

Pos logit contributions
ust+0.333
irk+0.273
oker+0.256
 forward+0.255
 possible+0.253
Neg logit contributions
lf-0.304
 airplane-0.256
 parked-0.252
 cardboard-0.244
 Year-0.243
Token and
Token ID290
Feature activation+1.66e+00

Pos logit contributions
ust+0.052
irk+0.046
oker+0.044
 mind+0.042
 loser+0.042
Neg logit contributions
lf-0.031
 airplane-0.025
 parked-0.025
 then-0.025
 cardboard-0.024
Token coins
Token ID10796
Feature activation-4.38e+00

Pos logit contributions
lf+0.037
 airplane+0.030
 Year+0.030
 parked+0.029
 cardboard+0.029
Neg logit contributions
ust-0.058
irk-0.049
oker-0.048
 possible-0.047
 mind-0.047
Token.
Token ID13
Feature activation-1.53e+00

Pos logit contributions
ust+0.140
irk+0.118
oker+0.114
 forward+0.108
 mind+0.108
Neg logit contributions
lf-0.117
 airplane-0.098
 parked-0.097
 cardboard-0.094
 set-0.093
Token She
Token ID1375
Feature activation+2.47e+00

Pos logit contributions
ust+0.044
irk+0.036
oker+0.035
 forward+0.034
 mind+0.034
Neg logit contributions
lf-0.045
 airplane-0.038
 parked-0.037
 Year-0.037
 cardboard-0.037
Token things
Token ID1243
Feature activation-5.38e+00

Pos logit contributions
 cardboard+0.013
lf+0.013
 then+0.013
 airplane+0.012
 walls+0.012
Neg logit contributions
ust-0.032
irk-0.027
ashing-0.027
umble-0.026
oker-0.026
Token and
Token ID290
Feature activation+1.11e+00

Pos logit contributions
ust+0.224
irk+0.208
rys+0.201
umble+0.199
oker+0.195
Neg logit contributions
 then-0.085
lf-0.079
 store-0.067
 parked-0.066
 airplane-0.061
Token keep
Token ID1394
Feature activation-4.05e+00

Pos logit contributions
lf+0.029
 then+0.025
 airplane+0.024
 parked+0.024
 cardboard+0.024
Neg logit contributions
ust-0.036
irk-0.031
oker-0.029
 forward-0.029
umble-0.028
Token them
Token ID606
Feature activation-5.73e+00

Pos logit contributions
ust+0.134
irk+0.116
oker+0.112
umpy+0.106
umble+0.103
Neg logit contributions
lf-0.105
 airplane-0.090
 cardboard-0.089
 Year-0.086
 then-0.086
Token for
Token ID329
Feature activation-4.69e+00

Pos logit contributions
irk+0.171
ust+0.171
umble+0.156
oker+0.156
rys+0.152
Neg logit contributions
lf-0.145
 airplane-0.135
 cardboard-0.131
 then-0.128
 set-0.127
Token too
Token ID1165
Feature activation-9.02e+00

Pos logit contributions
ust+0.126
 mind+0.099
 possible+0.096
irk+0.095
 forward+0.091
Neg logit contributions
lf-0.145
 airplane-0.128
 Year-0.125
 parked-0.124
cut-0.120
Token long
Token ID890
Feature activation-6.97e-01

Pos logit contributions
ust+0.235
irk+0.199
oker+0.188
rys+0.179
osa+0.174
Neg logit contributions
lf-0.280
 cardboard-0.243
 parked-0.240
 walls-0.239
 then-0.238
Token.
Token ID13
Feature activation-1.67e+00

Pos logit contributions
ust+0.021
irk+0.017
oker+0.017
 forward+0.016
 possible+0.016
Neg logit contributions
lf-0.020
 airplane-0.017
 parked-0.016
 cardboard-0.016
 Year-0.016
Token I
Token ID314
Feature activation-2.17e+00

Pos logit contributions
ust+0.051
irk+0.042
oker+0.040
 possible+0.039
 forward+0.039
Neg logit contributions
lf-0.045
 Year-0.038
 parked-0.038
 then-0.038
 airplane-0.038
Token should
Token ID815
Feature activation+8.08e+00

Pos logit contributions
ust+0.069
irk+0.058
oker+0.056
 possible+0.055
 forward+0.055
Neg logit contributions
lf-0.053
 parked-0.047
 then-0.047
 set-0.046
 cardboard-0.045
Token respect
Token ID2461
Feature activation-2.07e+00

Pos logit contributions
lf+0.233
 airplane+0.194
 parked+0.193
 cardboard+0.191
 then+0.191
Neg logit contributions
ust-0.239
irk-0.202
oker-0.191
 forward-0.188
 possible-0.181
Token and
Token ID290
Feature activation+1.80e+00

Pos logit contributions
ust+0.007
irk+0.005
oker+0.005
olding+0.005
 possible+0.005
Neg logit contributions
lf-0.004
 then-0.003
 walls-0.003
 set-0.003
 airplane-0.003
Token slow
Token ID3105
Feature activation+4.71e-01

Pos logit contributions
lf+0.042
 then+0.042
 set+0.040
 parked+0.038
 walls+0.038
Neg logit contributions
ust-0.065
irk-0.057
olding-0.052
aring-0.052
oker-0.052
Token.
Token ID13
Feature activation-1.07e+00

Pos logit contributions
 then+0.007
 set+0.007
 walls+0.007
bed+0.006
 trucks+0.006
Neg logit contributions
ust-0.020
oker-0.017
irk-0.017
olding-0.016
rys-0.016
Token They
Token ID1119
Feature activation+1.37e+00

Pos logit contributions
ust+0.037
irk+0.031
oker+0.030
 possible+0.029
beh+0.028
Neg logit contributions
lf-0.026
 parked-0.023
 then-0.022
 Year-0.021
 walls-0.021
Token bounce
Token ID20110
Feature activation+1.02e+00

Pos logit contributions
lf+0.034
 then+0.030
 set+0.029
 parked+0.029
 airplane+0.029
Neg logit contributions
ust-0.047
irk-0.039
oker-0.037
olding-0.037
aring-0.036
Token with
Token ID351
Feature activation-7.93e+00

Pos logit contributions
lf+0.017
 then+0.016
 set+0.015
 cardboard+0.014
 walls+0.014
Neg logit contributions
ust-0.042
irk-0.036
oker-0.036
olding-0.035
beh-0.034
Token joy
Token ID8716
Feature activation-6.96e+00

Pos logit contributions
ust+0.267
irk+0.218
oker+0.215
 possible+0.208
aring+0.207
Neg logit contributions
lf-0.201
 cardboard-0.175
 airplane-0.172
 then-0.168
 walls-0.167
Token and
Token ID290
Feature activation+1.87e+00

Pos logit contributions
ust+0.248
irk+0.221
oker+0.218
rys+0.200
olding+0.199
Neg logit contributions
lf-0.159
 then-0.132
 airplane-0.131
 set-0.129
 walls-0.126
Token laugh
Token ID6487
Feature activation-7.57e-01

Pos logit contributions
 then+0.052
lf+0.049
 set+0.047
 walls+0.042
 cardboard+0.042
Neg logit contributions
ust-0.065
elly-0.050
olding-0.047
umpy-0.047
irk-0.046
Token.
Token ID13
Feature activation-1.00e+00

Pos logit contributions
ust+0.025
irk+0.022
oker+0.021
olding+0.020
rys+0.020
Neg logit contributions
lf-0.019
 then-0.017
 set-0.017
 airplane-0.017
 walls-0.016
Token\n
Token ID198
Feature activation-1.48e+00

Pos logit contributions
ust+0.035
irk+0.029
oker+0.028
 possible+0.027
 forward+0.027
Neg logit contributions
lf-0.024
 parked-0.020
 then-0.020
 Year-0.020
 cardboard-0.019