TOP ACTIVATIONS
MAX = 0.122

the tree to its once former beauty . He was happy
, leaving it to its former state .
to be restored to its former glory ."
the village back to its former glory !"
back the village to its former state . With the help
was a little girl who badly hurt her foot . She
named Tim went outside to look at the moon . The
J ody went outside to look for butterflies . But there
outside all day so her eyes were very heavy . With
My palm hurts . ”
when she went outside to look at the stars , she
girl who didn 't like av oc ados . She always
Tim my went outside to look for some food . He
but when she tried to look at it properly , she
But when she tasted it , she felt weak
eye . He tried to look at it , but he
fresh morning air . His eyes widened in surprise as he
time there was a little av ac ado . It was
bird . It loved to look around and see all the
sky . The grass was freshly cut , and a gentle

MIN ACTIVATIONS
MIN = -0.096

I live at 10 Green Street . Thank you ."
" We live on Green Street , number 12 . Can
who had died a year ago . It was very precious
he had lost a week ago . He wanted to get
it here a long time ago ." " Maybe
can be more fun than competing . They are good brothers
, I love baking for you
lost it a long time ago . I thought I would
best friend a long time ago . He was my dog
, We all wish for lux
Now you have nothing to play with ." Ben
, It ’ s
that sharing is better than competing . They learn that purple
have no cookie and no play time ." Ben
had lived a long time ago . L
, No , Marcus . It
to prevent the bad guys ( the toys on the floor
, Let ’ s
it here a long time ago ." They decided
is from a long time ago , when people lived here

INTERVAL 0.068 - 0.122
CONTAINS 0.047%

top , Tom stopped to look around . He could see
loved to play outside and feel the warm sun on her
store , Sam liked to look at all the different things
" Wow , look at that !" Jill said
pasta was ready , it smelled so sweet and delicious that

INTERVAL 0.013 - 0.068
CONTAINS 23.872%

to protect them . <|endoftext|> Once upon a time , there
to play with . When she was done playing , she
gentle push . The girl smiled . She was so happy
known as the biggest and brav est helper in the jungle
" Let 's go find it together . We can choose

INTERVAL -0.041 - 0.013
CONTAINS 74.258%

's mom came into his room and asked him to pause
He was making a hard toy animal . Tim my thought
standing in the middle of the street . Puzz led ,
very happy , but he was also sure to always be
When they got home , Jake gave the puppet a big

INTERVAL -0.096 - -0.041
CONTAINS 1.823%

animals again . Maybe we will see some new animals or
with the match . They played a game of tag ,
was wrong . She said , " I am sorry ,
the sky . It was like a magic show . Tim
eat it , but I agree , it is sm elly
Token the
Token ID262
Feature activation+1.99e-02

Pos logit contributions
ocol+0.041
arter+0.037
elling+0.037
 wolves+0.036
 wolf+0.036
Neg logit contributions
suits-0.029
utes-0.027
J-0.026
oe-0.026
nes-0.025
Token tree
Token ID5509
Feature activation+2.52e-02

Pos logit contributions
ocol+0.046
arter+0.042
elling+0.041
opl+0.037
 wolves+0.037
Neg logit contributions
 wrist-0.028
suits-0.027
oe-0.025
 first-0.025
 stool-0.024
Token to
Token ID284
Feature activation+1.07e-02

Pos logit contributions
ocol+0.065
elling+0.058
 wolves+0.058
arter+0.057
 wolf+0.057
Neg logit contributions
suits-0.031
utes-0.029
nes-0.028
oe-0.028
 wrist-0.028
Token its
Token ID663
Feature activation+3.31e-02

Pos logit contributions
ocol+0.025
elling+0.024
 wolves+0.022
 worries+0.022
 wolf+0.022
Neg logit contributions
pack-0.014
nes-0.014
 wrist-0.014
utes-0.014
J-0.014
Token once
Token ID1752
Feature activation+1.24e-02

Pos logit contributions
ocol+0.076
elling+0.068
arter+0.065
 wolves+0.061
opl+0.061
Neg logit contributions
 wrist-0.052
suits-0.049
utes-0.048
oe-0.045
But-0.045
Token former
Token ID1966
Feature activation+1.22e-01

Pos logit contributions
ocol+0.029
 wolves+0.026
elling+0.026
arter+0.026
 wolf+0.025
Neg logit contributions
suits-0.018
oe-0.017
J-0.016
 wrist-0.016
utes-0.016
Token beauty
Token ID8737
Feature activation+2.48e-02

Pos logit contributions
 wolf+0.202
arter+0.196
ocol+0.188
 wolves+0.187
elling+0.173
Neg logit contributions
suits-0.227
 wrist-0.209
oe-0.207
utes-0.207
 hips-0.200
Token.
Token ID13
Feature activation-2.45e-03

Pos logit contributions
ocol+0.066
arter+0.059
elling+0.059
 hunger+0.059
 wolves+0.058
Neg logit contributions
suits-0.030
utes-0.026
nes-0.026
oe-0.026
 wrist-0.024
Token He
Token ID679
Feature activation-7.67e-03

Pos logit contributions
suits+0.004
utes+0.004
nes+0.004
J+0.004
oe+0.004
Neg logit contributions
ocol-0.005
 wolves-0.005
 wolf-0.004
elling-0.004
arter-0.004
Token was
Token ID373
Feature activation-7.02e-03

Pos logit contributions
suits+0.011
nes+0.010
 wrist+0.010
J+0.010
utes+0.010
Neg logit contributions
ocol-0.019
arter-0.017
elling-0.017
 wolves-0.016
 earthquake-0.016
Token happy
Token ID3772
Feature activation-3.05e-02

Pos logit contributions
nes+0.009
suits+0.009
utes+0.009
oe+0.008
J+0.008
Neg logit contributions
ocol-0.017
 wolves-0.016
elling-0.016
arter-0.016
 wolf-0.015
Token,
Token ID11
Feature activation-3.30e-02

Pos logit contributions
suits+0.035
utes+0.032
nes+0.031
J+0.030
oe+0.030
Neg logit contributions
ocol-0.064
 wolves-0.058
elling-0.057
 wolf-0.057
arter-0.056
Token leaving
Token ID4305
Feature activation+1.20e-02

Pos logit contributions
suits+0.047
utes+0.044
nes+0.042
J+0.041
 wrist+0.041
Neg logit contributions
ocol-0.079
 wolves-0.071
elling-0.070
arter-0.069
 wolf-0.069
Token it
Token ID340
Feature activation+1.72e-02

Pos logit contributions
ocol+0.028
elling+0.025
arter+0.025
opl+0.025
 wolves+0.025
Neg logit contributions
suits-0.017
utes-0.016
J-0.016
oe-0.015
nes-0.015
Token to
Token ID284
Feature activation+7.48e-03

Pos logit contributions
ocol+0.041
 wolves+0.036
elling+0.036
 wolf+0.036
arter+0.035
Neg logit contributions
suits-0.025
oe-0.023
nes-0.023
utes-0.023
 wrist-0.022
Token its
Token ID663
Feature activation+1.65e-02

Pos logit contributions
ocol+0.018
elling+0.017
 wolves+0.015
 worries+0.015
 wolf+0.015
Neg logit contributions
J-0.010
nes-0.009
 wrist-0.009
They-0.009
utes-0.009
Token former
Token ID1966
Feature activation+1.13e-01

Pos logit contributions
ocol+0.036
elling+0.033
opl+0.030
arter+0.029
 wolves+0.029
Neg logit contributions
suits-0.028
utes-0.026
 wrist-0.026
oe-0.025
nes-0.024
Token state
Token ID1181
Feature activation+2.74e-02

Pos logit contributions
 wolf+0.196
 wolves+0.187
ocol+0.179
arter+0.168
elling+0.165
Neg logit contributions
suits-0.230
 wrist-0.204
utes-0.202
oe-0.199
 hips-0.197
Token.
Token ID13
Feature activation-1.34e-02

Pos logit contributions
ocol+0.070
 wolves+0.064
elling+0.064
 wolf+0.063
arter+0.063
Neg logit contributions
suits-0.034
utes-0.031
nes-0.031
J-0.030
oe-0.030
Token
Token ID220
Feature activation+1.60e-02

Pos logit contributions
suits+0.018
utes+0.017
nes+0.017
 wrist+0.016
oe+0.016
Neg logit contributions
ocol-0.031
 wolves-0.028
elling-0.028
 hunger-0.028
 worries-0.028
Token
Token ID198
Feature activation-3.18e-03

Pos logit contributions
ocol+0.038
opl+0.035
 wolves+0.035
elling+0.035
 destruction+0.034
Neg logit contributions
suits-0.021
 wrist-0.018
utes-0.018
nes-0.018
J-0.017
Token
Token ID198
Feature activation+3.20e-03

Pos logit contributions
suits+0.006
suit+0.005
 wrist+0.005
nes+0.005
utes+0.005
Neg logit contributions
ocol-0.006
opl-0.005
 wolves-0.005
 destruction-0.005
 wolf-0.005
Token to
Token ID284
Feature activation-4.70e-03

Pos logit contributions
suits+0.016
utes+0.015
J+0.015
nes+0.015
 wrist+0.014
Neg logit contributions
ocol-0.026
 wolves-0.024
elling-0.024
 wolf-0.023
 worries-0.023
Token be
Token ID307
Feature activation-7.29e-03

Pos logit contributions
suits+0.006
nes+0.006
utes+0.006
 wrist+0.006
pack+0.006
Neg logit contributions
ocol-0.011
elling-0.010
 wolves-0.010
arter-0.010
 wolf-0.010
Token restored
Token ID15032
Feature activation-2.16e-02

Pos logit contributions
suits+0.012
utes+0.010
nes+0.010
oe+0.010
pack+0.010
Neg logit contributions
ocol-0.017
arter-0.015
elling-0.015
 wolves-0.015
 wolf-0.015
Token to
Token ID284
Feature activation+2.77e-03

Pos logit contributions
suits+0.029
utes+0.027
nes+0.026
oe+0.025
J+0.025
Neg logit contributions
ocol-0.053
 wolves-0.048
arter-0.047
elling-0.047
 wolf-0.047
Token its
Token ID663
Feature activation+2.33e-02

Pos logit contributions
ocol+0.007
elling+0.006
 wolf+0.006
 wolves+0.006
arter+0.006
Neg logit contributions
nes-0.003
 wrist-0.003
pack-0.003
utes-0.003
suits-0.003
Token former
Token ID1966
Feature activation+1.12e-01

Pos logit contributions
ocol+0.048
elling+0.043
arter+0.041
opl+0.039
 wolves+0.038
Neg logit contributions
suits-0.041
 wrist-0.039
utes-0.039
oe-0.037
nes-0.036
Token glory
Token ID13476
Feature activation+4.92e-03

Pos logit contributions
 wolf+0.181
ocol+0.175
 wolves+0.171
arter+0.162
elling+0.154
Neg logit contributions
suits-0.231
 wrist-0.214
utes-0.212
oe-0.203
 hips-0.203
Token."
Token ID526
Feature activation-1.09e-02

Pos logit contributions
ocol+0.012
 wolves+0.011
elling+0.010
 wolf+0.010
 worries+0.010
Neg logit contributions
suits-0.007
utes-0.007
nes-0.006
J-0.006
oe-0.006
Token
Token ID220
Feature activation+1.11e-02

Pos logit contributions
suits+0.014
utes+0.013
nes+0.013
 wrist+0.012
pack+0.012
Neg logit contributions
ocol-0.027
elling-0.024
 worries-0.024
 wolves-0.024
 wolf-0.024
Token
Token ID198
Feature activation-6.06e-03

Pos logit contributions
ocol+0.028
 wolves+0.025
elling+0.025
 wolf+0.025
arter+0.025
Neg logit contributions
suits-0.014
utes-0.012
 wrist-0.012
nes-0.012
J-0.011
Token
Token ID198
Feature activation+1.09e-03

Pos logit contributions
suits+0.011
 wrist+0.010
suit+0.009
nes+0.009
utes+0.009
Neg logit contributions
ocol-0.011
elling-0.010
 wolf-0.010
 wolves-0.010
arter-0.010
Token the
Token ID262
Feature activation-5.92e-03

Pos logit contributions
ocol+0.026
arter+0.025
 wolves+0.024
 wolf+0.024
elling+0.023
Neg logit contributions
suits-0.012
utes-0.012
oe-0.011
nes-0.011
 wrist-0.011
Token village
Token ID7404
Feature activation-9.63e-03

Pos logit contributions
suits+0.011
 wrist+0.010
utes+0.010
oe+0.010
J+0.010
Neg logit contributions
ocol-0.012
arter-0.010
elling-0.010
opl-0.010
 wolves-0.010
Token back
Token ID736
Feature activation+2.77e-02

Pos logit contributions
suits+0.013
utes+0.012
nes+0.012
oe+0.012
 wrist+0.012
Neg logit contributions
ocol-0.023
 wolves-0.021
arter-0.021
 wolf-0.021
elling-0.021
Token to
Token ID284
Feature activation+1.50e-04

Pos logit contributions
ocol+0.071
arter+0.064
 wolf+0.063
 wolves+0.063
elling+0.063
Neg logit contributions
suits-0.035
utes-0.032
nes-0.031
oe-0.030
 wrist-0.029
Token its
Token ID663
Feature activation+1.45e-02

Pos logit contributions
ocol+0.000
elling+0.000
 wolves+0.000
 wolf+0.000
 grizz+0.000
Neg logit contributions
utes-0.000
nes-0.000
They-0.000
 wrist-0.000
But-0.000
Token former
Token ID1966
Feature activation+1.04e-01

Pos logit contributions
ocol+0.030
elling+0.027
arter+0.025
opl+0.025
 wolves+0.024
Neg logit contributions
suits-0.025
 wrist-0.025
utes-0.024
oe-0.023
nes-0.023
Token glory
Token ID13476
Feature activation+1.98e-03

Pos logit contributions
 wolf+0.166
 wolves+0.159
ocol+0.157
arter+0.150
vious+0.138
Neg logit contributions
suits-0.218
 wrist-0.205
oe-0.198
utes-0.197
j-0.189
Token!"
Token ID2474
Feature activation-3.10e-02

Pos logit contributions
ocol+0.005
 wolves+0.004
 wolf+0.004
elling+0.004
arter+0.004
Neg logit contributions
suits-0.003
utes-0.003
nes-0.003
J-0.002
oe-0.002
Token
Token ID220
Feature activation+2.52e-03

Pos logit contributions
suits+0.046
utes+0.043
nes+0.041
oe+0.039
J+0.038
Neg logit contributions
ocol-0.071
elling-0.065
 wolves-0.065
 wolf-0.064
 worries-0.064
Token
Token ID198
Feature activation-2.18e-02

Pos logit contributions
ocol+0.006
elling+0.006
opl+0.006
 wolves+0.006
arter+0.006
Neg logit contributions
suits-0.003
utes-0.003
 wrist-0.003
J-0.003
nes-0.003
Token
Token ID198
Feature activation-1.40e-02

Pos logit contributions
suits+0.037
suit+0.034
utes+0.031
nes+0.031
 wrist+0.031
Neg logit contributions
ocol-0.042
opl-0.039
 wolf-0.039
 destruction-0.039
 wolves-0.038
Token back
Token ID736
Feature activation+3.61e-02

Pos logit contributions
suits+0.011
utes+0.011
J+0.011
oe+0.010
nes+0.010
Neg logit contributions
ocol-0.016
arter-0.015
 wolves-0.014
 wolf-0.014
elling-0.014
Token the
Token ID262
Feature activation-6.45e-03

Pos logit contributions
ocol+0.091
arter+0.086
opl+0.082
elling+0.082
 earthquake+0.081
Neg logit contributions
suits-0.048
utes-0.044
J-0.042
oe-0.041
nes-0.040
Token village
Token ID7404
Feature activation-7.54e-03

Pos logit contributions
suits+0.013
utes+0.012
nes+0.012
J+0.012
oe+0.011
Neg logit contributions
ocol-0.012
 wolves-0.011
arter-0.010
elling-0.010
 wolf-0.010
Token to
Token ID284
Feature activation+2.56e-03

Pos logit contributions
suits+0.011
utes+0.010
nes+0.010
J+0.009
oe+0.009
Neg logit contributions
ocol-0.018
 wolves-0.016
elling-0.016
 wolf-0.016
arter-0.016
Token its
Token ID663
Feature activation+1.63e-02

Pos logit contributions
ocol+0.006
elling+0.006
 wolves+0.006
 wolf+0.005
 worries+0.005
Neg logit contributions
utes-0.003
J-0.003
 wrist-0.003
nes-0.003
suits-0.003
Token former
Token ID1966
Feature activation+1.03e-01

Pos logit contributions
ocol+0.035
elling+0.032
arter+0.030
opl+0.030
vious+0.029
Neg logit contributions
 wrist-0.027
suits-0.026
utes-0.025
oe-0.025
nes-0.024
Token state
Token ID1181
Feature activation+2.26e-02

Pos logit contributions
 wolf+0.164
 wolves+0.158
ocol+0.154
arter+0.149
vious+0.137
Neg logit contributions
suits-0.217
 wrist-0.205
oe-0.197
utes-0.197
 hips-0.189
Token.
Token ID13
Feature activation-1.41e-02

Pos logit contributions
ocol+0.056
 wolves+0.051
 wolf+0.051
elling+0.051
arter+0.050
Neg logit contributions
suits-0.030
utes-0.028
nes-0.027
J-0.026
oe-0.026
Token With
Token ID2080
Feature activation-3.10e-03

Pos logit contributions
suits+0.015
utes+0.014
nes+0.013
J+0.013
oe+0.012
Neg logit contributions
ocol-0.038
 wolves-0.035
elling-0.035
 wolf-0.035
 hunger-0.035
Token the
Token ID262
Feature activation+7.67e-03

Pos logit contributions
suits+0.004
utes+0.004
J+0.004
 wrist+0.004
They+0.004
Neg logit contributions
ocol-0.007
arter-0.007
elling-0.007
 wolves-0.007
 wolf-0.007
Token help
Token ID1037
Feature activation+4.14e-03

Pos logit contributions
ocol+0.015
 wolves+0.014
arter+0.014
elling+0.014
 wolf+0.013
Neg logit contributions
suits-0.014
utes-0.013
J-0.013
oe-0.012
 wrist-0.012
Token was
Token ID373
Feature activation+1.89e-02

Pos logit contributions
ocol+0.030
elling+0.026
 wolves+0.026
arter+0.026
 wolf+0.025
Neg logit contributions
suits-0.019
J-0.018
utes-0.018
nes-0.017
oe-0.017
Token a
Token ID257
Feature activation-3.26e-03

Pos logit contributions
ocol+0.039
 wolves+0.036
 wolf+0.036
 hunger+0.036
 worries+0.034
Neg logit contributions
suits-0.033
utes-0.031
nes-0.030
 wrist-0.029
oe-0.029
Token little
Token ID1310
Feature activation+8.64e-03

Pos logit contributions
suits+0.007
nes+0.006
utes+0.006
 wrist+0.006
z+0.006
Neg logit contributions
ocol-0.005
 wolf-0.005
 wolves-0.005
arter-0.005
 hunger-0.005
Token girl
Token ID2576
Feature activation+1.99e-02

Pos logit contributions
ocol+0.018
 wolves+0.016
elling+0.016
 worries+0.015
 wolf+0.015
Neg logit contributions
suits-0.015
utes-0.014
nes-0.014
J-0.014
oe-0.013
Token who
Token ID508
Feature activation+2.22e-02

Pos logit contributions
ocol+0.030
 wolf+0.026
 wolves+0.026
arter+0.025
elling+0.025
Neg logit contributions
suits-0.045
utes-0.044
nes-0.043
J-0.040
oe-0.040
Token badly
Token ID11234
Feature activation+8.98e-02

Pos logit contributions
ocol+0.048
 wolves+0.042
elling+0.042
arter+0.042
 wolf+0.041
Neg logit contributions
suits-0.037
utes-0.034
nes-0.034
J-0.034
oe-0.033
Token hurt
Token ID5938
Feature activation+4.09e-02

Pos logit contributions
ocol+0.209
elling+0.185
 wolves+0.183
arter+0.181
 hunger+0.178
Neg logit contributions
suits-0.141
utes-0.132
oe-0.126
nes-0.124
 wrist-0.123
Token her
Token ID607
Feature activation+2.54e-02

Pos logit contributions
ocol+0.105
arter+0.092
elling+0.092
opl+0.091
 worries+0.091
Neg logit contributions
suits-0.052
 wrist-0.050
nes-0.049
J-0.049
utes-0.048
Token foot
Token ID2366
Feature activation+3.52e-02

Pos logit contributions
ocol+0.057
elling+0.051
 wolves+0.050
arter+0.049
 wolf+0.049
Neg logit contributions
suits-0.040
utes-0.039
 wrist-0.037
nes-0.037
oe-0.035
Token.
Token ID13
Feature activation+4.27e-02

Pos logit contributions
ocol+0.081
 wolves+0.075
 wolf+0.074
elling+0.073
 worries+0.073
Neg logit contributions
suits-0.051
utes-0.049
nes-0.046
J-0.046
oe-0.044
Token She
Token ID1375
Feature activation+3.17e-02

Pos logit contributions
ocol+0.084
 wolves+0.074
elling+0.073
 wolf+0.073
 worries+0.072
Neg logit contributions
suits-0.079
utes-0.075
nes-0.073
J-0.072
oe-0.070
Token named
Token ID3706
Feature activation+1.52e-02

Pos logit contributions
ocol+0.023
 wolves+0.020
elling+0.020
 worries+0.020
 wolf+0.019
Neg logit contributions
suits-0.018
J-0.017
utes-0.017
oe-0.016
nes-0.016
Token Tim
Token ID5045
Feature activation+8.83e-03

Pos logit contributions
ocol+0.023
 wolves+0.020
elling+0.020
 wolf+0.020
arter+0.019
Neg logit contributions
suits-0.035
utes-0.034
nes-0.033
J-0.032
 wrist-0.032
Token went
Token ID1816
Feature activation+1.43e-02

Pos logit contributions
ocol+0.019
 wolves+0.016
elling+0.016
arter+0.016
 worries+0.016
Neg logit contributions
suits-0.016
J-0.015
utes-0.014
nes-0.014
oe-0.014
Token outside
Token ID2354
Feature activation-6.43e-03

Pos logit contributions
ocol+0.033
 wolves+0.030
 wolf+0.030
elling+0.029
arter+0.029
Neg logit contributions
suits-0.021
utes-0.020
nes-0.019
J-0.019
 wrist-0.019
Token to
Token ID284
Feature activation+3.65e-02

Pos logit contributions
suits+0.010
utes+0.009
nes+0.009
J+0.009
 wrist+0.008
Neg logit contributions
ocol-0.015
 wolves-0.014
 wolf-0.013
 hunger-0.013
arter-0.013
Token look
Token ID804
Feature activation+8.96e-02

Pos logit contributions
ocol+0.098
elling+0.089
 wolves+0.088
 worries+0.087
arter+0.087
Neg logit contributions
suits-0.040
utes-0.038
 wrist-0.036
nes-0.036
J-0.036
Token at
Token ID379
Feature activation+4.03e-02

Pos logit contributions
ocol+0.189
opl+0.179
 destruction+0.177
 worries+0.174
 wolves+0.174
Neg logit contributions
suits-0.144
utes-0.136
oe-0.124
J-0.118
pack-0.116
Token the
Token ID262
Feature activation+3.56e-02

Pos logit contributions
ocol+0.109
arter+0.101
elling+0.097
opl+0.093
 worries+0.092
Neg logit contributions
utes-0.040
 wrist-0.039
suits-0.038
J-0.037
Mom-0.036
Token moon
Token ID8824
Feature activation-1.51e-02

Pos logit contributions
ocol+0.077
arter+0.067
 worries+0.067
 hunger+0.067
elling+0.065
Neg logit contributions
suits-0.060
utes-0.059
J-0.054
Mom-0.053
oe-0.053
Token.
Token ID13
Feature activation+4.83e-02

Pos logit contributions
suits+0.025
nes+0.023
utes+0.023
J+0.023
oe+0.021
Neg logit contributions
ocol-0.033
 wolves-0.030
 wolf-0.029
elling-0.029
arter-0.029
Token The
Token ID383
Feature activation+3.12e-02

Pos logit contributions
ocol+0.096
 wolves+0.084
 worries+0.083
elling+0.082
 wolf+0.081
Neg logit contributions
suits-0.090
utes-0.085
 wrist-0.080
oe-0.080
J-0.080
Token J
Token ID449
Feature activation+3.64e-03

Pos logit contributions
suits+0.042
nes+0.041
utes+0.039
 wrist+0.038
pack+0.037
Neg logit contributions
ocol-0.043
 worries-0.036
 wolf-0.036
elling-0.036
 wolves-0.035
Tokenody
Token ID1118
Feature activation+3.24e-02

Pos logit contributions
ocol+0.007
elling+0.006
opl+0.006
arter+0.006
 worries+0.006
Neg logit contributions
suits-0.007
utes-0.006
nes-0.006
z-0.006
They-0.006
Token went
Token ID1816
Feature activation+1.74e-02

Pos logit contributions
ocol+0.068
 wolves+0.061
elling+0.060
 wolf+0.060
arter+0.059
Neg logit contributions
suits-0.055
utes-0.052
nes-0.051
J-0.050
oe-0.049
Token outside
Token ID2354
Feature activation-1.15e-02

Pos logit contributions
ocol+0.044
 wolves+0.040
elling+0.039
 worries+0.039
 hunger+0.039
Neg logit contributions
suits-0.023
utes-0.021
nes-0.020
J-0.020
oe-0.020
Token to
Token ID284
Feature activation+3.60e-02

Pos logit contributions
suits+0.015
utes+0.014
nes+0.013
J+0.013
oe+0.012
Neg logit contributions
ocol-0.029
 wolves-0.027
elling-0.026
 wolf-0.026
 worries-0.026
Token look
Token ID804
Feature activation+8.83e-02

Pos logit contributions
ocol+0.103
 wolves+0.093
elling+0.092
 worries+0.092
opl+0.091
Neg logit contributions
 wrist-0.031
pack-0.031
suits-0.031
J-0.031
utes-0.030
Token for
Token ID329
Feature activation+3.92e-02

Pos logit contributions
ocol+0.192
opl+0.181
arter+0.170
 wolves+0.168
 destruction+0.168
Neg logit contributions
suits-0.146
utes-0.133
oe-0.126
J-0.125
But-0.117
Token butterflies
Token ID44571
Feature activation+1.23e-02

Pos logit contributions
opl+0.101
elling+0.100
arter+0.098
ocol+0.098
 shouldn+0.092
Neg logit contributions
J-0.049
suits-0.037
 wrist-0.037
utes-0.034
 board-0.034
Token.
Token ID13
Feature activation+2.95e-02

Pos logit contributions
ocol+0.028
 wolves+0.026
 wolf+0.026
 worries+0.025
elling+0.025
Neg logit contributions
suits-0.018
utes-0.017
nes-0.016
J-0.016
oe-0.015
Token But
Token ID887
Feature activation+3.08e-02

Pos logit contributions
ocol+0.050
 wolves+0.045
elling+0.044
 wolf+0.044
 worries+0.043
Neg logit contributions
suits-0.060
utes-0.058
nes-0.056
oe-0.054
J-0.054
Token there
Token ID612
Feature activation-3.05e-03

Pos logit contributions
ocol+0.076
elling+0.069
 wolves+0.069
arter+0.068
 wolf+0.067
Neg logit contributions
suits-0.040
utes-0.039
nes-0.038
J-0.037
 wrist-0.036
Token outside
Token ID2354
Feature activation-1.92e-03

Pos logit contributions
suits+0.050
utes+0.044
oe+0.043
J+0.042
nes+0.042
Neg logit contributions
 wolves-0.072
ocol-0.072
 wolf-0.071
 hunger-0.066
 ow-0.066
Token all
Token ID477
Feature activation+5.23e-02

Pos logit contributions
suits+0.003
utes+0.002
nes+0.002
J+0.002
oe+0.002
Neg logit contributions
ocol-0.005
 wolves-0.004
 wolf-0.004
elling-0.004
 worries-0.004
Token day
Token ID1110
Feature activation-2.80e-02

Pos logit contributions
ocol+0.118
 wolves+0.104
 worries+0.104
 hunger+0.104
 wolf+0.104
Neg logit contributions
suits-0.080
utes-0.077
oe-0.076
nes-0.075
J-0.075
Token so
Token ID523
Feature activation+4.32e-02

Pos logit contributions
suits+0.049
utes+0.047
J+0.046
nes+0.044
Gl+0.044
Neg logit contributions
ocol-0.049
 wolf-0.047
 wolves-0.047
 earthquake-0.046
 hunger-0.046
Token her
Token ID607
Feature activation+3.30e-02

Pos logit contributions
ocol+0.091
elling+0.081
 wolves+0.080
arter+0.079
 wolf+0.079
Neg logit contributions
suits-0.073
utes-0.071
J-0.069
nes-0.068
oe-0.066
Token eyes
Token ID2951
Feature activation+8.79e-02

Pos logit contributions
ocol+0.072
 wolves+0.064
elling+0.063
 wolf+0.063
arter+0.063
Neg logit contributions
suits-0.054
utes-0.052
nes-0.050
J-0.048
oe-0.048
Token were
Token ID547
Feature activation+1.67e-02

Pos logit contributions
ocol+0.196
arter+0.170
 wolves+0.168
elling+0.168
 wolf+0.164
Neg logit contributions
suits-0.148
nes-0.132
utes-0.132
oe-0.129
J-0.128
Token very
Token ID845
Feature activation+4.11e-02

Pos logit contributions
ocol+0.037
elling+0.032
 wolves+0.032
arter+0.032
 worries+0.031
Neg logit contributions
suits-0.027
utes-0.026
nes-0.025
J-0.025
oe-0.024
Token heavy
Token ID4334
Feature activation+1.19e-02

Pos logit contributions
ocol+0.093
elling+0.083
opl+0.083
 wolves+0.082
 wolf+0.080
Neg logit contributions
suits-0.065
utes-0.059
oe-0.059
nes-0.057
J-0.056
Token.
Token ID13
Feature activation+2.73e-02

Pos logit contributions
ocol+0.034
arter+0.031
elling+0.031
opl+0.030
 hunger+0.030
Neg logit contributions
suits-0.012
utes-0.010
nes-0.009
 wrist-0.009
oe-0.009
Token With
Token ID2080
Feature activation+1.28e-02

Pos logit contributions
ocol+0.051
 wolves+0.046
elling+0.045
 wolf+0.045
 hunger+0.044
Neg logit contributions
suits-0.053
utes-0.050
nes-0.049
J-0.047
oe-0.046
Token
Token ID95
Feature activation-2.64e-02

Pos logit contributions
suits+0.014
utes+0.013
J+0.012
nes+0.012
They+0.011
Neg logit contributions
ocol-0.017
arter-0.015
elling-0.015
 wolf-0.015
opl-0.014
Token
Token ID26391
Feature activation-1.94e-02

Pos logit contributions
suits+0.046
utes+0.044
J+0.039
oe+0.039
nes+0.039
Neg logit contributions
ocol-0.054
elling-0.050
opl-0.049
arter-0.048
 wolf-0.047
Token
Token ID129
Feature activation+2.51e-03

Pos logit contributions
utes+0.032
suits+0.030
nes+0.029
pack+0.027
oe+0.027
Neg logit contributions
opl-0.035
ocol-0.034
 wolf-0.034
 shouldn-0.033
arter-0.033
Token
Token ID241
Feature activation-6.14e-02

Pos logit contributions
ocol+0.005
 wolves+0.004
elling+0.004
 wolf+0.004
arter+0.004
Neg logit contributions
suits-0.005
utes-0.005
nes-0.004
J-0.004
oe-0.004
TokenMy
Token ID3666
Feature activation+2.32e-02

Pos logit contributions
suits+0.085
utes+0.079
nes+0.076
pack+0.074
 wrist+0.073
Neg logit contributions
ocol-0.136
 wolf-0.127
 wolves-0.126
 shouldn-0.123
opl-0.121
Token palm
Token ID18057
Feature activation+8.73e-02

Pos logit contributions
ocol+0.063
elling+0.057
 wolves+0.057
opl+0.055
 destruction+0.055
Neg logit contributions
suits-0.026
utes-0.024
 wrist-0.023
oe-0.022
nes-0.021
Token hurts
Token ID20406
Feature activation+5.33e-02

Pos logit contributions
ocol+0.195
arter+0.172
 hunger+0.169
 wolves+0.167
 destruction+0.164
Neg logit contributions
suits-0.131
 wrist-0.130
oe-0.127
utes-0.124
Anna-0.119
Token.
Token ID13
Feature activation+1.15e-02

Pos logit contributions
ocol+0.140
arter+0.122
 hunger+0.121
elling+0.120
 worries+0.119
Neg logit contributions
suits-0.065
They-0.061
nes-0.059
utes-0.059
J-0.058
Tokenâ
Token ID22940
Feature activation+5.25e-03

Pos logit contributions
ocol+0.022
 wolves+0.019
 wolf+0.019
elling+0.019
 worries+0.019
Neg logit contributions
suits-0.022
utes-0.021
nes-0.020
J-0.020
oe-0.019
Token
Token ID26391
Feature activation-2.57e-02

Pos logit contributions
ocol+0.011
elling+0.010
opl+0.010
arter+0.010
 wolves+0.010
Neg logit contributions
suits-0.009
utes-0.009
nes-0.008
J-0.008
 wrist-0.008
Token
Token ID220
Feature activation+1.83e-02

Pos logit contributions
utes+0.041
suits+0.039
nes+0.037
oe+0.036
J+0.036
Neg logit contributions
ocol-0.050
opl-0.049
 worries-0.047
 wolves-0.047
 wolf-0.047
Token when
Token ID618
Feature activation-3.28e-03

Pos logit contributions
ocol+0.004
elling+0.003
arter+0.003
 wolves+0.003
opl+0.003
Neg logit contributions
suits-0.002
utes-0.002
J-0.002
nes-0.002
 wrist-0.002
Token she
Token ID673
Feature activation+3.86e-02

Pos logit contributions
suits+0.004
utes+0.004
J+0.004
oe+0.004
nes+0.004
Neg logit contributions
ocol-0.008
elling-0.007
 wolves-0.007
arter-0.007
 wolf-0.007
Token went
Token ID1816
Feature activation+1.59e-02

Pos logit contributions
ocol+0.085
 wolves+0.075
arter+0.075
elling+0.075
opl+0.074
Neg logit contributions
suits-0.063
J-0.059
oe-0.058
nes-0.058
Anna-0.058
Token outside
Token ID2354
Feature activation-1.52e-02

Pos logit contributions
ocol+0.045
elling+0.040
 worries+0.040
 hunger+0.039
arter+0.039
Neg logit contributions
suits-0.016
utes-0.015
oe-0.015
nes-0.014
J-0.013
Token to
Token ID284
Feature activation+2.33e-02

Pos logit contributions
suits+0.017
utes+0.016
nes+0.015
oe+0.015
J+0.015
Neg logit contributions
ocol-0.041
elling-0.037
 wolves-0.037
arter-0.036
 wolf-0.036
Token look
Token ID804
Feature activation+8.68e-02

Pos logit contributions
ocol+0.059
elling+0.054
 wolves+0.053
arter+0.051
 destruction+0.051
Neg logit contributions
 wrist-0.028
suits-0.027
nes-0.027
utes-0.026
pack-0.026
Token at
Token ID379
Feature activation+3.47e-02

Pos logit contributions
ocol+0.213
opl+0.202
arter+0.194
 destruction+0.193
 wolves+0.190
Neg logit contributions
suits-0.116
utes-0.100
oe-0.099
J-0.096
Anna-0.092
Token the
Token ID262
Feature activation+2.27e-02

Pos logit contributions
ocol+0.089
arter+0.079
elling+0.077
 wolves+0.076
 wolf+0.075
Neg logit contributions
utes-0.041
 wrist-0.040
suits-0.037
oe-0.036
J-0.036
Token stars
Token ID5788
Feature activation-7.26e-03

Pos logit contributions
ocol+0.047
arter+0.042
 hunger+0.040
 wolves+0.039
elling+0.039
Neg logit contributions
utes-0.039
suits-0.039
oe-0.035
J-0.035
 wrist-0.035
Token,
Token ID11
Feature activation+2.06e-02

Pos logit contributions
suits+0.010
utes+0.009
nes+0.009
J+0.008
oe+0.008
Neg logit contributions
ocol-0.018
 wolves-0.016
elling-0.016
 wolf-0.016
 worries-0.016
Token she
Token ID673
Feature activation+2.52e-02

Pos logit contributions
ocol+0.049
elling+0.044
 wolves+0.043
arter+0.043
 wolf+0.043
Neg logit contributions
suits-0.030
utes-0.028
J-0.027
nes-0.027
 wrist-0.026
Token girl
Token ID2576
Feature activation+2.12e-02

Pos logit contributions
ocol+0.020
elling+0.018
 wolves+0.017
opl+0.017
 worries+0.017
Neg logit contributions
suits-0.015
utes-0.013
oe-0.013
J-0.013
 wrist-0.012
Token who
Token ID508
Feature activation+2.14e-02

Pos logit contributions
ocol+0.034
 wolves+0.030
 wolf+0.030
arter+0.029
elling+0.029
Neg logit contributions
suits-0.046
utes-0.044
nes-0.043
J-0.041
oe-0.041
Token didn
Token ID1422
Feature activation+5.14e-03

Pos logit contributions
ocol+0.049
elling+0.044
 wolves+0.043
arter+0.043
 wolf+0.042
Neg logit contributions
suits-0.034
J-0.031
nes-0.031
oe-0.030
utes-0.030
Token't
Token ID470
Feature activation+6.31e-02

Pos logit contributions
ocol+0.013
 wolves+0.012
elling+0.012
 wolf+0.012
arter+0.012
Neg logit contributions
suits-0.007
utes-0.005
 wrist-0.005
J-0.005
nes-0.005
Token like
Token ID588
Feature activation-3.34e-02

Pos logit contributions
ocol+0.137
elling+0.123
 wolves+0.121
 worries+0.119
 wolf+0.119
Neg logit contributions
suits-0.100
nes-0.099
J-0.096
utes-0.095
ana-0.092
Token av
Token ID1196
Feature activation+8.67e-02

Pos logit contributions
suits+0.040
utes+0.039
nes+0.037
 wrist+0.037
J+0.037
Neg logit contributions
ocol-0.084
elling-0.077
arter-0.074
opl-0.073
 wolves-0.072
Tokenoc
Token ID420
Feature activation+2.30e-02

Pos logit contributions
ocol+0.105
 wolves+0.087
arter+0.087
elling+0.083
 wolf+0.081
Neg logit contributions
suits-0.219
z-0.206
utes-0.204
J-0.202
 wrist-0.201
Tokenados
Token ID22484
Feature activation+4.62e-02

Pos logit contributions
 wolves+0.034
ocol+0.034
arter+0.033
 wolf+0.031
 nightmare+0.031
Neg logit contributions
 wrist-0.043
oe-0.042
z-0.041
utes-0.041
suits-0.039
Token.
Token ID13
Feature activation+4.03e-02

Pos logit contributions
ocol+0.122
elling+0.111
 wolves+0.109
arter+0.108
 hunger+0.107
Neg logit contributions
suits-0.054
utes-0.050
nes-0.049
 wrist-0.046
J-0.046
Token She
Token ID1375
Feature activation+2.68e-02

Pos logit contributions
ocol+0.071
 wolves+0.063
elling+0.063
 wolf+0.062
 hunger+0.061
Neg logit contributions
suits-0.081
utes-0.079
nes-0.076
J-0.073
oe-0.072
Token always
Token ID1464
Feature activation+4.46e-03

Pos logit contributions
ocol+0.063
arter+0.057
elling+0.056
 wolves+0.055
opl+0.054
Neg logit contributions
suits-0.041
nes-0.039
J-0.037
utes-0.037
oe-0.036
Token Tim
Token ID5045
Feature activation+5.20e-03

Pos logit contributions
ocol+0.002
elling+0.001
 wolves+0.001
arter+0.001
opl+0.001
Neg logit contributions
suits-0.002
utes-0.001
J-0.001
oe-0.001
They-0.001
Tokenmy
Token ID1820
Feature activation+8.42e-03

Pos logit contributions
ocol+0.011
 wolves+0.009
elling+0.009
arter+0.009
 hunger+0.009
Neg logit contributions
suits-0.010
utes-0.009
J-0.009
nes-0.009
oe-0.009
Token went
Token ID1816
Feature activation+1.13e-02

Pos logit contributions
ocol+0.021
elling+0.018
arter+0.018
 wolves+0.017
opl+0.017
Neg logit contributions
suits-0.013
J-0.011
utes-0.011
nes-0.011
oe-0.011
Token outside
Token ID2354
Feature activation-1.19e-02

Pos logit contributions
ocol+0.029
elling+0.026
 wolves+0.026
 worries+0.026
 hunger+0.026
Neg logit contributions
suits-0.014
utes-0.013
nes-0.013
oe-0.012
J-0.012
Token to
Token ID284
Feature activation+3.14e-02

Pos logit contributions
suits+0.016
utes+0.015
nes+0.014
J+0.014
oe+0.013
Neg logit contributions
ocol-0.029
 wolves-0.027
 wolf-0.026
elling-0.026
 worries-0.026
Token look
Token ID804
Feature activation+8.65e-02

Pos logit contributions
ocol+0.093
elling+0.085
opl+0.083
 wolves+0.082
 worries+0.082
Neg logit contributions
pack-0.026
 wrist-0.025
suits-0.024
nes-0.023
J-0.022
Token for
Token ID329
Feature activation+4.27e-02

Pos logit contributions
ocol+0.190
opl+0.180
arter+0.167
 destruction+0.165
 worries+0.164
Neg logit contributions
suits-0.139
utes-0.122
 together-0.117
oe-0.115
But-0.114
Token some
Token ID617
Feature activation+2.51e-02

Pos logit contributions
opl+0.111
elling+0.108
ocol+0.102
arter+0.099
 shouldn+0.099
Neg logit contributions
J-0.046
 fast-0.040
 board-0.040
suits-0.039
 with-0.039
Token food
Token ID2057
Feature activation+1.80e-02

Pos logit contributions
opl+0.049
ocol+0.048
arter+0.045
elling+0.044
 grizz+0.041
Neg logit contributions
suits-0.044
utes-0.037
They-0.035
 fast-0.034
 shirts-0.034
Token.
Token ID13
Feature activation+2.99e-02

Pos logit contributions
ocol+0.045
elling+0.040
 wolves+0.040
 worries+0.040
arter+0.040
Neg logit contributions
suits-0.024
utes-0.022
nes-0.021
J-0.021
oe-0.020
Token He
Token ID679
Feature activation+4.02e-03

Pos logit contributions
ocol+0.053
 wolves+0.047
 wolf+0.046
elling+0.046
 worries+0.045
Neg logit contributions
suits-0.061
utes-0.059
nes-0.057
J-0.056
oe-0.055
Token but
Token ID475
Feature activation+3.52e-02

Pos logit contributions
suits+0.013
utes+0.013
nes+0.012
J+0.012
oe+0.011
Neg logit contributions
ocol-0.022
 wolves-0.020
 wolf-0.020
 worries-0.020
elling-0.020
Token when
Token ID618
Feature activation-7.75e-03

Pos logit contributions
ocol+0.082
 wolves+0.074
elling+0.073
arter+0.073
 wolf+0.072
Neg logit contributions
suits-0.052
utes-0.050
nes-0.049
J-0.048
 wrist-0.047
Token she
Token ID673
Feature activation+3.57e-02

Pos logit contributions
suits+0.013
utes+0.012
nes+0.012
J+0.011
oe+0.011
Neg logit contributions
ocol-0.017
 wolves-0.015
 wolf-0.015
 worries-0.015
elling-0.015
Token tried
Token ID3088
Feature activation+1.08e-02

Pos logit contributions
ocol+0.089
 wolves+0.084
opl+0.084
arter+0.082
elling+0.081
Neg logit contributions
suits-0.047
J-0.045
oe-0.044
nes-0.042
 wrist-0.041
Token to
Token ID284
Feature activation+1.45e-02

Pos logit contributions
ocol+0.027
 wolves+0.025
elling+0.025
arter+0.025
 wolf+0.025
Neg logit contributions
suits-0.014
utes-0.013
nes-0.012
J-0.012
oe-0.012
Token look
Token ID804
Feature activation+8.65e-02

Pos logit contributions
ocol+0.031
 wolf+0.028
 hunger+0.028
 worries+0.028
 wolves+0.028
Neg logit contributions
suits-0.024
utes-0.023
J-0.021
nes-0.021
oe-0.020
Token at
Token ID379
Feature activation+3.06e-02

Pos logit contributions
ocol+0.199
 wolves+0.182
arter+0.180
opl+0.178
 hunger+0.176
Neg logit contributions
suits-0.130
utes-0.118
oe-0.116
J-0.116
nes-0.115
Token it
Token ID340
Feature activation+3.93e-02

Pos logit contributions
ocol+0.081
arter+0.076
 wolves+0.075
 wolf+0.073
opl+0.072
Neg logit contributions
suits-0.030
utes-0.030
oe-0.029
 wrist-0.029
J-0.028
Token properly
Token ID6105
Feature activation+4.53e-02

Pos logit contributions
ocol+0.097
 wolves+0.087
elling+0.086
arter+0.086
 wolf+0.085
Neg logit contributions
suits-0.052
oe-0.048
utes-0.048
nes-0.047
 wrist-0.046
Token,
Token ID11
Feature activation+2.06e-03

Pos logit contributions
ocol+0.095
 wolves+0.084
 wolf+0.084
 worries+0.083
 hunger+0.082
Neg logit contributions
suits-0.078
utes-0.075
nes-0.071
J-0.070
 wrist-0.068
Token she
Token ID673
Feature activation+2.11e-02

Pos logit contributions
ocol+0.005
 wolves+0.004
elling+0.004
 wolf+0.004
arter+0.004
Neg logit contributions
suits-0.003
utes-0.003
nes-0.003
J-0.003
 wrist-0.003
Token\n
Token ID198
Feature activation-3.75e-03

Pos logit contributions
ocol+0.030
opl+0.028
elling+0.028
arter+0.028
 wolves+0.027
Neg logit contributions
suits-0.023
 wrist-0.020
suit-0.020
nes-0.019
utes-0.019
Token\n
Token ID198
Feature activation+8.11e-03

Pos logit contributions
suits+0.007
suit+0.006
nes+0.006
 wrist+0.006
utes+0.006
Neg logit contributions
ocol-0.006
 wolf-0.006
 wolves-0.006
opl-0.006
elling-0.005
TokenBut
Token ID1537
Feature activation+2.32e-02

Pos logit contributions
 wolf+0.015
 wolves+0.014
 accidents+0.014
 poison+0.014
ocol+0.014
Neg logit contributions
suits-0.013
suit-0.011
utes-0.011
oe-0.011
 hips-0.011
Token when
Token ID618
Feature activation-1.03e-02

Pos logit contributions
ocol+0.051
 wolves+0.046
elling+0.046
arter+0.045
 wolf+0.045
Neg logit contributions
suits-0.037
utes-0.035
nes-0.034
J-0.033
oe-0.033
Token she
Token ID673
Feature activation+3.31e-02

Pos logit contributions
suits+0.013
oe+0.013
J+0.013
nes+0.012
utes+0.012
Neg logit contributions
ocol-0.026
elling-0.024
opl-0.023
 wolves-0.023
arter-0.023
Token tasted
Token ID29187
Feature activation+8.59e-02

Pos logit contributions
ocol+0.078
 wolves+0.073
opl+0.071
arter+0.070
 wolf+0.069
Neg logit contributions
suits-0.046
J-0.045
oe-0.044
Anna-0.042
nes-0.041
Token it
Token ID340
Feature activation+5.07e-02

Pos logit contributions
ocol+0.223
opl+0.220
arter+0.216
 wolves+0.216
elling+0.208
Neg logit contributions
oe-0.078
J-0.075
They-0.070
suits-0.069
But-0.069
Token,
Token ID11
Feature activation+1.30e-02

Pos logit contributions
ocol+0.139
 wolves+0.128
opl+0.126
arter+0.124
 dangers+0.121
Neg logit contributions
oe-0.055
suits-0.051
J-0.045
 first-0.043
z-0.043
Token she
Token ID673
Feature activation+2.03e-02

Pos logit contributions
ocol+0.035
opl+0.034
elling+0.034
arter+0.033
 wolves+0.030
Neg logit contributions
J-0.014
suits-0.013
nes-0.013
Anna-0.012
oe-0.012
Token felt
Token ID2936
Feature activation+4.82e-02

Pos logit contributions
ocol+0.046
opl+0.040
elling+0.040
 wolves+0.040
arter+0.040
Neg logit contributions
suits-0.033
J-0.031
utes-0.030
oe-0.029
nes-0.029
Token weak
Token ID4939
Feature activation+8.10e-03

Pos logit contributions
opl+0.111
ocol+0.111
elling+0.104
arter+0.102
 wolves+0.099
Neg logit contributions
J-0.060
But-0.059
 fast-0.055
They-0.055
ends-0.054
Token eye
Token ID4151
Feature activation+3.52e-02

Pos logit contributions
ocol+0.038
elling+0.034
 wolves+0.032
arter+0.032
 destruction+0.031
Neg logit contributions
 wrist-0.024
utes-0.021
suits-0.021
oe-0.019
pack-0.018
Token.
Token ID13
Feature activation+3.28e-02

Pos logit contributions
ocol+0.092
arter+0.083
elling+0.082
 wolves+0.082
 worries+0.081
Neg logit contributions
suits-0.040
utes-0.039
 wrist-0.036
J-0.034
nes-0.034
Token He
Token ID679
Feature activation+9.71e-03

Pos logit contributions
ocol+0.060
 wolves+0.052
elling+0.051
 wolf+0.051
arter+0.051
Neg logit contributions
suits-0.065
utes-0.062
nes-0.061
J-0.060
 wrist-0.059
Token tried
Token ID3088
Feature activation+6.14e-03

Pos logit contributions
ocol+0.025
arter+0.023
elling+0.022
opl+0.022
 wolves+0.021
Neg logit contributions
suits-0.013
J-0.012
nes-0.012
Anna-0.012
 wrist-0.011
Token to
Token ID284
Feature activation+2.08e-02

Pos logit contributions
ocol+0.018
elling+0.016
arter+0.016
 wolves+0.015
opl+0.015
Neg logit contributions
utes-0.006
suits-0.006
J-0.005
 wrist-0.005
oe-0.005
Token look
Token ID804
Feature activation+8.58e-02

Pos logit contributions
ocol+0.048
 wolves+0.043
 wolf+0.043
elling+0.042
 worries+0.042
Neg logit contributions
suits-0.032
utes-0.030
nes-0.028
J-0.028
oe-0.027
Token at
Token ID379
Feature activation+3.69e-02

Pos logit contributions
ocol+0.199
opl+0.182
arter+0.180
 wolves+0.176
elling+0.174
Neg logit contributions
suits-0.129
utes-0.114
J-0.111
oe-0.111
nes-0.107
Token it
Token ID340
Feature activation+3.87e-02

Pos logit contributions
ocol+0.098
arter+0.089
elling+0.085
 wolves+0.084
 wolf+0.082
Neg logit contributions
 first-0.035
suits-0.035
 wrist-0.035
utes-0.035
J-0.033
Token,
Token ID11
Feature activation+4.12e-03

Pos logit contributions
ocol+0.093
 wolves+0.083
elling+0.082
arter+0.081
 wolf+0.081
Neg logit contributions
suits-0.055
utes-0.051
nes-0.049
J-0.048
oe-0.048
Token but
Token ID475
Feature activation+4.25e-02

Pos logit contributions
ocol+0.012
elling+0.011
arter+0.011
 wolves+0.011
opl+0.011
Neg logit contributions
suits-0.004
utes-0.003
 wrist-0.003
J-0.003
nes-0.003
Token he
Token ID339
Feature activation+1.22e-02

Pos logit contributions
ocol+0.116
elling+0.109
arter+0.105
opl+0.104
 wolves+0.102
Neg logit contributions
suits-0.044
 wrist-0.043
J-0.043
nes-0.043
But-0.042
Token fresh
Token ID4713
Feature activation+5.16e-02

Pos logit contributions
ocol+0.054
 worries+0.047
arter+0.045
 wolves+0.045
elling+0.045
Neg logit contributions
suits-0.032
oe-0.031
Mom-0.031
utes-0.030
J-0.030
Token morning
Token ID3329
Feature activation-1.87e-02

Pos logit contributions
ocol+0.117
 shouldn+0.114
arter+0.108
 worries+0.107
 awful+0.103
Neg logit contributions
suits-0.077
oe-0.069
Mom-0.061
suit-0.059
j-0.059
Token air
Token ID1633
Feature activation+3.59e-03

Pos logit contributions
suits+0.026
utes+0.025
nes+0.024
J+0.023
oe+0.022
Neg logit contributions
ocol-0.045
 wolves-0.041
 wolf-0.040
elling-0.040
 worries-0.040
Token.
Token ID13
Feature activation+4.59e-02

Pos logit contributions
ocol+0.009
 wolves+0.008
 wolf+0.008
elling+0.008
 worries+0.008
Neg logit contributions
suits-0.005
utes-0.005
nes-0.004
J-0.004
oe-0.004
Token His
Token ID2399
Feature activation+2.82e-02

Pos logit contributions
ocol+0.088
 wolves+0.076
elling+0.075
 worries+0.075
 wolf+0.074
Neg logit contributions
suits-0.088
utes-0.084
nes-0.081
J-0.080
 wrist-0.080
Token eyes
Token ID2951
Feature activation+8.55e-02

Pos logit contributions
ocol+0.065
elling+0.057
 wolves+0.057
arter+0.056
 wolf+0.056
Neg logit contributions
suits-0.044
utes-0.042
nes-0.039
 wrist-0.038
oe-0.038
Token widened
Token ID36405
Feature activation+1.09e-02

Pos logit contributions
ocol+0.202
arter+0.178
 wolves+0.177
elling+0.172
 wolf+0.170
Neg logit contributions
suits-0.131
oe-0.115
They-0.114
utes-0.113
J-0.112
Token in
Token ID287
Feature activation+2.24e-02

Pos logit contributions
ocol+0.023
 wolves+0.022
 wolf+0.022
 hunger+0.021
arter+0.021
Neg logit contributions
suits-0.018
nes-0.017
utes-0.016
J-0.016
 wrist-0.015
Token surprise
Token ID5975
Feature activation+2.14e-02

Pos logit contributions
ocol+0.054
elling+0.052
arter+0.050
 worries+0.048
 wolves+0.048
Neg logit contributions
Mom-0.027
J-0.026
 wrist-0.025
They-0.025
But-0.024
Token as
Token ID355
Feature activation-9.49e-03

Pos logit contributions
ocol+0.054
elling+0.051
opl+0.049
 destruction+0.048
arter+0.048
Neg logit contributions
suits-0.025
utes-0.024
J-0.022
oe-0.022
 wrist-0.022
Token he
Token ID339
Feature activation+2.21e-02

Pos logit contributions
suits+0.015
utes+0.015
J+0.015
nes+0.014
oe+0.014
Neg logit contributions
ocol-0.020
elling-0.018
 wolves-0.018
arter-0.018
opl-0.017
Token time
Token ID640
Feature activation-3.12e-03

Pos logit contributions
ocol+0.004
 worries+0.003
 wolves+0.003
 hunger+0.003
 wolf+0.003
Neg logit contributions
utes-0.006
suits-0.006
 wrist-0.006
nes-0.006
pack-0.005
Token there
Token ID612
Feature activation+9.53e-03

Pos logit contributions
suits+0.006
utes+0.005
nes+0.005
 wrist+0.005
J+0.005
Neg logit contributions
ocol-0.006
 wolves-0.005
 worries-0.005
 wolf-0.005
arter-0.005
Token was
Token ID373
Feature activation+1.50e-02

Pos logit contributions
ocol+0.023
elling+0.021
 wolves+0.021
arter+0.020
 wolf+0.020
Neg logit contributions
suits-0.013
J-0.013
oe-0.012
nes-0.012
utes-0.012
Token a
Token ID257
Feature activation-5.60e-03

Pos logit contributions
ocol+0.032
 wolves+0.029
 wolf+0.029
 hunger+0.029
 worries+0.028
Neg logit contributions
suits-0.025
utes-0.024
nes-0.023
 wrist-0.022
oe-0.022
Token little
Token ID1310
Feature activation+4.27e-03

Pos logit contributions
suits+0.011
nes+0.011
utes+0.011
 wrist+0.010
J+0.010
Neg logit contributions
ocol-0.010
 wolf-0.010
 wolves-0.009
arter-0.009
 hunger-0.009
Token av
Token ID1196
Feature activation+8.51e-02

Pos logit contributions
ocol+0.009
elling+0.008
 wolves+0.008
opl+0.008
 worries+0.008
Neg logit contributions
suits-0.007
utes-0.006
oe-0.006
J-0.006
nes-0.006
Tokenac
Token ID330
Feature activation+4.28e-02

Pos logit contributions
ocol+0.089
 wolves+0.069
 worries+0.067
elling+0.067
arter+0.067
Neg logit contributions
suits-0.233
utes-0.224
J-0.219
nes-0.217
oe-0.216
Tokenado
Token ID4533
Feature activation+2.68e-02

Pos logit contributions
ocol+0.066
 wolves+0.055
arter+0.055
 hunger+0.054
 wolf+0.053
Neg logit contributions
utes-0.094
suits-0.094
J-0.092
nes-0.091
oe-0.090
Token.
Token ID13
Feature activation+4.86e-02

Pos logit contributions
ocol+0.061
elling+0.055
 wolves+0.054
 worries+0.053
 hunger+0.053
Neg logit contributions
suits-0.041
utes-0.039
J-0.038
oe-0.037
nes-0.037
Token It
Token ID632
Feature activation+4.14e-02

Pos logit contributions
ocol+0.093
 wolves+0.081
elling+0.081
 worries+0.080
 wolf+0.080
Neg logit contributions
suits-0.092
utes-0.088
J-0.085
nes-0.085
 wrist-0.084
Token was
Token ID373
Feature activation-4.55e-03

Pos logit contributions
ocol+0.104
elling+0.090
arter+0.088
 wolves+0.087
 worries+0.087
Neg logit contributions
suits-0.058
J-0.056
nes-0.055
oe-0.054
Anna-0.053
Token bird
Token ID6512
Feature activation+4.46e-02

Pos logit contributions
ocol+0.048
 worries+0.042
elling+0.042
opl+0.042
 destruction+0.041
Neg logit contributions
suits-0.044
utes-0.041
oe-0.040
J-0.040
nes-0.039
Token.
Token ID13
Feature activation+4.38e-02

Pos logit contributions
ocol+0.096
 wolves+0.086
elling+0.085
 worries+0.084
 wolf+0.084
Neg logit contributions
suits-0.073
utes-0.070
nes-0.067
J-0.066
oe-0.065
Token It
Token ID632
Feature activation+4.48e-02

Pos logit contributions
ocol+0.092
 worries+0.083
elling+0.082
 wolves+0.081
 destruction+0.079
Neg logit contributions
suits-0.072
utes-0.069
J-0.067
 wrist-0.067
nes-0.065
Token loved
Token ID6151
Feature activation-1.03e-02

Pos logit contributions
ocol+0.128
elling+0.111
arter+0.110
 obstacle+0.107
 worries+0.105
Neg logit contributions
But-0.049
Anna-0.048
J-0.047
suits-0.046
They-0.046
Token to
Token ID284
Feature activation+2.57e-02

Pos logit contributions
utes+0.007
suits+0.006
They+0.005
J+0.005
 wrist+0.004
Neg logit contributions
ocol-0.032
elling-0.030
arter-0.029
opl-0.029
 worries-0.028
Token look
Token ID804
Feature activation+8.48e-02

Pos logit contributions
ocol+0.062
elling+0.057
 wolves+0.055
 worries+0.055
arter+0.054
Neg logit contributions
suits-0.034
utes-0.034
 wrist-0.032
nes-0.032
J-0.031
Token around
Token ID1088
Feature activation+2.05e-02

Pos logit contributions
ocol+0.195
opl+0.188
 destruction+0.181
arter+0.179
 worries+0.178
Neg logit contributions
suits-0.116
utes-0.115
oe-0.101
J-0.092
They-0.092
Token and
Token ID290
Feature activation+3.17e-02

Pos logit contributions
ocol+0.051
elling+0.050
 worries+0.049
arter+0.048
opl+0.047
Neg logit contributions
utes-0.022
suits-0.021
But-0.019
oe-0.018
They-0.018
Token see
Token ID766
Feature activation+5.26e-02

Pos logit contributions
ocol+0.083
elling+0.076
arter+0.073
opl+0.073
 wolves+0.072
Neg logit contributions
suits-0.037
 wrist-0.035
utes-0.033
But-0.033
J-0.031
Token all
Token ID477
Feature activation+5.36e-02

Pos logit contributions
ocol+0.130
arter+0.122
 earthquake+0.115
elling+0.115
opl+0.114
Neg logit contributions
utes-0.064
They-0.063
 together-0.060
 wrist-0.058
But-0.057
Token the
Token ID262
Feature activation+2.79e-02

Pos logit contributions
ocol+0.144
 hunger+0.133
 worries+0.133
 grizz+0.132
arter+0.131
Neg logit contributions
 with-0.045
They-0.045
oe-0.044
J-0.041
utes-0.040
Token sky
Token ID6766
Feature activation-1.09e-02

Pos logit contributions
suits+0.004
utes+0.004
J+0.004
oe+0.004
nes+0.004
Neg logit contributions
ocol-0.006
 wolves-0.005
arter-0.005
 worries-0.005
elling-0.005
Token.
Token ID13
Feature activation+2.29e-02

Pos logit contributions
suits+0.015
utes+0.015
nes+0.014
J+0.014
oe+0.013
Neg logit contributions
ocol-0.026
 wolves-0.023
elling-0.023
 wolf-0.023
 worries-0.023
Token The
Token ID383
Feature activation+1.94e-02

Pos logit contributions
ocol+0.044
 wolves+0.039
elling+0.038
 wolf+0.038
 worries+0.038
Neg logit contributions
suits-0.043
utes-0.041
nes-0.040
J-0.039
oe-0.039
Token grass
Token ID8701
Feature activation+1.69e-02

Pos logit contributions
ocol+0.036
 wolves+0.031
elling+0.031
arter+0.031
 wolf+0.031
Neg logit contributions
suits-0.038
utes-0.036
nes-0.035
J-0.035
oe-0.034
Token was
Token ID373
Feature activation-2.92e-03

Pos logit contributions
ocol+0.038
 wolves+0.034
elling+0.033
arter+0.033
 hunger+0.033
Neg logit contributions
suits-0.027
utes-0.025
oe-0.025
J-0.024
nes-0.024
Token freshly
Token ID29026
Feature activation+8.48e-02

Pos logit contributions
suits+0.004
utes+0.004
oe+0.004
nes+0.004
J+0.004
Neg logit contributions
ocol-0.007
 wolves-0.006
elling-0.006
arter-0.006
 worries-0.006
Token cut
Token ID2005
Feature activation+3.55e-02

Pos logit contributions
ocol+0.194
 wolves+0.174
elling+0.173
 wolf+0.172
 worries+0.171
Neg logit contributions
suits-0.130
utes-0.121
nes-0.119
J-0.114
oe-0.111
Token,
Token ID11
Feature activation-6.09e-03

Pos logit contributions
ocol+0.084
arter+0.076
 wolves+0.075
 hunger+0.074
elling+0.074
Neg logit contributions
suits-0.051
utes-0.047
nes-0.045
oe-0.044
J-0.044
Token and
Token ID290
Feature activation+1.34e-03

Pos logit contributions
suits+0.008
utes+0.007
nes+0.007
J+0.007
 wrist+0.007
Neg logit contributions
ocol-0.015
 wolves-0.014
elling-0.014
arter-0.014
 wolf-0.013
Token a
Token ID257
Feature activation-2.03e-02

Pos logit contributions
ocol+0.003
arter+0.003
elling+0.003
opl+0.003
 wolves+0.003
Neg logit contributions
suits-0.002
J-0.002
utes-0.001
They-0.001
 wrist-0.001
Token gentle
Token ID10296
Feature activation+8.43e-03

Pos logit contributions
suits+0.034
utes+0.032
nes+0.031
J+0.030
oe+0.030
Neg logit contributions
ocol-0.043
elling-0.039
 wolves-0.039
arter-0.038
 worries-0.037
Token I
Token ID314
Feature activation+9.82e-03

Pos logit contributions
ocol+0.003
elling+0.003
arter+0.003
 destruction+0.003
opl+0.003
Neg logit contributions
suits-0.002
 wrist-0.002
nes-0.002
utes-0.002
Anna-0.002
Token live
Token ID2107
Feature activation-4.40e-02

Pos logit contributions
ocol+0.021
elling+0.020
 wolves+0.019
arter+0.019
 earthquake+0.019
Neg logit contributions
suits-0.016
nes-0.015
utes-0.015
oe-0.014
pack-0.014
Token at
Token ID379
Feature activation-2.68e-02

Pos logit contributions
suits+0.066
utes+0.065
J+0.061
 wrist+0.061
nes+0.061
Neg logit contributions
ocol-0.094
 wolves-0.089
 wolf-0.087
 dangers-0.082
 hunger-0.081
Token 10
Token ID838
Feature activation-4.92e-03

Pos logit contributions
suits+0.044
utes+0.043
nes+0.041
J+0.040
oe+0.040
Neg logit contributions
ocol-0.057
elling-0.051
 wolves-0.050
arter-0.050
 worries-0.050
Token Green
Token ID3469
Feature activation-8.90e-04

Pos logit contributions
suits+0.008
utes+0.007
nes+0.007
J+0.007
oe+0.007
Neg logit contributions
ocol-0.011
 wolves-0.010
 wolf-0.010
elling-0.010
 worries-0.010
Token Street
Token ID3530
Feature activation-9.57e-02

Pos logit contributions
suits+0.001
utes+0.001
J+0.001
nes+0.001
 wrist+0.001
Neg logit contributions
ocol-0.002
 wolves-0.002
elling-0.002
 wolf-0.002
 worries-0.002
Token.
Token ID13
Feature activation+1.68e-02

Pos logit contributions
suits+0.145
utes+0.144
J+0.135
nes+0.134
oe+0.129
Neg logit contributions
ocol-0.210
 wolves-0.195
 wolf-0.188
elling-0.184
 hunger-0.183
Token Thank
Token ID6952
Feature activation-1.49e-02

Pos logit contributions
ocol+0.030
 wolves+0.027
elling+0.026
 wolf+0.026
arter+0.026
Neg logit contributions
suits-0.033
utes-0.032
nes-0.031
J-0.031
oe-0.030
Token you
Token ID345
Feature activation+5.83e-03

Pos logit contributions
suits+0.032
utes+0.028
nes+0.027
 wrist+0.027
suit+0.027
Neg logit contributions
ocol-0.028
 wolves-0.024
 wolf-0.023
elling-0.023
 worries-0.022
Token."
Token ID526
Feature activation-1.60e-02

Pos logit contributions
ocol+0.010
 wolves+0.009
 wolf+0.009
elling+0.009
 worries+0.009
Neg logit contributions
suits-0.012
utes-0.011
nes-0.011
J-0.011
 wrist-0.011
Token\n
Token ID198
Feature activation-9.33e-03

Pos logit contributions
suits+0.029
utes+0.028
nes+0.028
z+0.025
 wrist+0.025
Neg logit contributions
ocol-0.030
 wolves-0.028
elling-0.028
opl-0.028
 wolf-0.027
Token "
Token ID366
Feature activation-3.30e-02

Pos logit contributions
utes+0.009
nes+0.009
suits+0.009
aults+0.009
 wrist+0.008
Neg logit contributions
 wolves-0.006
 wolf-0.006
 grizz-0.005
ocol-0.005
der-0.005
TokenWe
Token ID1135
Feature activation-4.29e-03

Pos logit contributions
suits+0.053
utes+0.052
 hips+0.049
oe+0.048
uits+0.045
Neg logit contributions
 wolf-0.055
 dangers-0.054
 wolves-0.054
vious-0.052
der-0.051
Token live
Token ID2107
Feature activation-3.98e-02

Pos logit contributions
suits+0.007
nes+0.007
utes+0.006
oe+0.006
J+0.006
Neg logit contributions
ocol-0.009
elling-0.008
 wolves-0.008
arter-0.008
 wolf-0.008
Token on
Token ID319
Feature activation+1.30e-02

Pos logit contributions
suits+0.055
utes+0.054
nes+0.051
J+0.050
 wrist+0.050
Neg logit contributions
ocol-0.092
 wolves-0.085
 wolf-0.083
 hunger-0.081
 worries-0.081
Token Green
Token ID3469
Feature activation+6.47e-03

Pos logit contributions
ocol+0.029
elling+0.027
arter+0.026
 wolves+0.026
 wolf+0.026
Neg logit contributions
suits-0.020
utes-0.019
nes-0.018
J-0.018
oe-0.017
Token Street
Token ID3530
Feature activation-9.43e-02

Pos logit contributions
ocol+0.014
 wolves+0.012
elling+0.012
 wolf+0.012
 worries+0.012
Neg logit contributions
suits-0.011
utes-0.010
nes-0.010
J-0.010
oe-0.010
Token,
Token ID11
Feature activation-4.00e-02

Pos logit contributions
suits+0.137
utes+0.137
nes+0.127
J+0.127
oe+0.122
Neg logit contributions
ocol-0.213
 wolves-0.197
 wolf-0.191
 worries-0.190
elling-0.189
Token number
Token ID1271
Feature activation-9.24e-03

Pos logit contributions
suits+0.057
utes+0.053
nes+0.051
 wrist+0.050
J+0.049
Neg logit contributions
ocol-0.095
elling-0.086
arter-0.085
 wolves-0.084
 hunger-0.083
Token 12
Token ID1105
Feature activation+2.07e-03

Pos logit contributions
suits+0.014
utes+0.013
nes+0.013
oe+0.012
J+0.012
Neg logit contributions
ocol-0.021
arter-0.019
 wolves-0.019
 wolf-0.019
 hunger-0.019
Token.
Token ID13
Feature activation-1.88e-03

Pos logit contributions
ocol+0.005
arter+0.005
 wolf+0.005
 wolves+0.004
elling+0.004
Neg logit contributions
suits-0.003
utes-0.003
oe-0.003
nes-0.002
 wrist-0.002
Token Can
Token ID1680
Feature activation-3.30e-02

Pos logit contributions
suits+0.003
utes+0.003
nes+0.003
J+0.003
oe+0.003
Neg logit contributions
ocol-0.004
 wolves-0.003
elling-0.003
 wolf-0.003
arter-0.003
Token who
Token ID508
Feature activation-2.17e-02

Pos logit contributions
suits+0.074
utes+0.069
 wrist+0.067
J+0.067
nes+0.067
Neg logit contributions
ocol-0.141
elling-0.127
arter-0.124
 wolves-0.124
 worries-0.122
Token had
Token ID550
Feature activation-8.38e-03

Pos logit contributions
suits+0.030
J+0.028
nes+0.028
oe+0.027
Anna+0.027
Neg logit contributions
ocol-0.053
arter-0.047
 wolves-0.047
elling-0.047
 hunger-0.045
Token died
Token ID3724
Feature activation+2.60e-02

Pos logit contributions
suits+0.012
utes+0.011
nes+0.011
J+0.011
 wrist+0.011
Neg logit contributions
ocol-0.020
arter-0.018
elling-0.018
 wolves-0.018
 wolf-0.018
Token a
Token ID257
Feature activation-4.63e-02

Pos logit contributions
ocol+0.067
 wolves+0.060
elling+0.060
arter+0.059
 wolf+0.059
Neg logit contributions
suits-0.033
utes-0.030
nes-0.030
oe-0.029
J-0.029
Token year
Token ID614
Feature activation-8.05e-03

Pos logit contributions
suits+0.083
utes+0.077
J+0.076
oe+0.076
nes+0.075
Neg logit contributions
ocol-0.095
elling-0.086
 wolves-0.082
arter-0.081
 worries-0.080
Token ago
Token ID2084
Feature activation-9.32e-02

Pos logit contributions
suits+0.012
utes+0.011
nes+0.011
J+0.011
oe+0.010
Neg logit contributions
ocol-0.019
 wolves-0.017
elling-0.017
 wolf-0.017
 worries-0.016
Token.
Token ID13
Feature activation-4.25e-03

Pos logit contributions
utes+0.166
suits+0.162
nes+0.155
J+0.154
Anna+0.150
Neg logit contributions
ocol-0.177
 wolves-0.155
 hunger-0.155
 worries-0.155
 earthquake-0.154
Token It
Token ID632
Feature activation+7.85e-03

Pos logit contributions
suits+0.006
utes+0.006
nes+0.005
J+0.005
oe+0.005
Neg logit contributions
ocol-0.010
 wolves-0.009
elling-0.009
 wolf-0.009
arter-0.009
Token was
Token ID373
Feature activation-2.06e-02

Pos logit contributions
ocol+0.022
arter+0.019
opl+0.018
elling+0.018
 obstacle+0.018
Neg logit contributions
suits-0.009
oe-0.009
nes-0.009
Anna-0.008
They-0.008
Token very
Token ID845
Feature activation+2.14e-02

Pos logit contributions
oe+0.021
nes+0.019
utes+0.019
suits+0.019
J+0.019
Neg logit contributions
ocol-0.053
elling-0.050
opl-0.050
 obstacle-0.049
 wolves-0.048
Token precious
Token ID14186
Feature activation+1.04e-02

Pos logit contributions
ocol+0.042
 wolves+0.039
opl+0.039
elling+0.038
 wolf+0.038
Neg logit contributions
oe-0.037
suits-0.036
utes-0.035
nes-0.034
J-0.034
Token he
Token ID339
Feature activation-6.68e-03

Pos logit contributions
suits+0.028
nes+0.026
utes+0.026
J+0.026
oe+0.025
Neg logit contributions
ocol-0.041
elling-0.037
arter-0.036
 wolves-0.036
 worries-0.035
Token had
Token ID550
Feature activation-1.59e-02

Pos logit contributions
suits+0.012
utes+0.011
nes+0.011
J+0.011
 wrist+0.010
Neg logit contributions
ocol-0.014
 wolves-0.012
 wolf-0.012
elling-0.012
 worries-0.012
Token lost
Token ID2626
Feature activation-8.61e-03

Pos logit contributions
suits+0.029
utes+0.027
nes+0.026
J+0.026
oe+0.025
Neg logit contributions
ocol-0.032
 wolves-0.028
 worries-0.028
 wolf-0.028
elling-0.028
Token a
Token ID257
Feature activation-4.65e-02

Pos logit contributions
suits+0.014
utes+0.013
nes+0.012
J+0.012
 wrist+0.012
Neg logit contributions
ocol-0.019
 wolves-0.017
elling-0.017
 wolf-0.017
arter-0.017
Token week
Token ID1285
Feature activation-5.82e-02

Pos logit contributions
suits+0.072
utes+0.068
nes+0.065
J+0.064
oe+0.063
Neg logit contributions
ocol-0.105
 wolves-0.095
elling-0.094
 wolf-0.092
 worries-0.092
Token ago
Token ID2084
Feature activation-9.30e-02

Pos logit contributions
suits+0.081
utes+0.078
nes+0.074
J+0.073
oe+0.070
Neg logit contributions
ocol-0.138
 wolves-0.126
 wolf-0.124
elling-0.123
 worries-0.123
Token.
Token ID13
Feature activation+6.34e-04

Pos logit contributions
utes+0.165
suits+0.160
J+0.157
nes+0.154
Anna+0.153
Neg logit contributions
ocol-0.172
 wolves-0.154
 hunger-0.153
 earthquake-0.151
 worries-0.151
Token He
Token ID679
Feature activation-1.60e-02

Pos logit contributions
ocol+0.001
 wolves+0.001
elling+0.001
 worries+0.001
 wolf+0.001
Neg logit contributions
suits-0.001
utes-0.001
J-0.001
nes-0.001
 wrist-0.001
Token wanted
Token ID2227
Feature activation-8.81e-03

Pos logit contributions
suits+0.026
utes+0.024
nes+0.023
J+0.023
oe+0.022
Neg logit contributions
ocol-0.036
elling-0.032
 wolves-0.032
arter-0.032
 wolf-0.031
Token to
Token ID284
Feature activation+8.53e-03

Pos logit contributions
suits+0.010
utes+0.009
J+0.008
oe+0.008
nes+0.008
Neg logit contributions
ocol-0.024
elling-0.022
arter-0.022
 wolves-0.022
opl-0.022
Token get
Token ID651
Feature activation+2.59e-02

Pos logit contributions
ocol+0.021
 wolves+0.019
elling+0.019
 wolf+0.018
arter+0.018
Neg logit contributions
suits-0.012
utes-0.011
nes-0.011
 wrist-0.011
J-0.010
Token it
Token ID340
Feature activation-4.48e-04

Pos logit contributions
suits+0.041
utes+0.039
nes+0.037
J+0.037
oe+0.037
Neg logit contributions
ocol-0.072
 wolves-0.066
elling-0.066
arter-0.065
 worries-0.065
Token here
Token ID994
Feature activation-3.26e-02

Pos logit contributions
suits+0.001
utes+0.001
nes+0.001
oe+0.001
J+0.001
Neg logit contributions
ocol-0.001
 wolves-0.001
elling-0.001
 wolf-0.001
 worries-0.001
Token a
Token ID257
Feature activation-5.95e-02

Pos logit contributions
suits+0.050
utes+0.049
J+0.046
nes+0.046
oe+0.044
Neg logit contributions
ocol-0.072
 wolves-0.066
elling-0.064
 wolf-0.064
 hunger-0.064
Token long
Token ID890
Feature activation-4.22e-02

Pos logit contributions
suits+0.111
utes+0.105
nes+0.101
J+0.100
oe+0.099
Neg logit contributions
ocol-0.116
elling-0.103
 wolves-0.101
 worries-0.099
arter-0.099
Token time
Token ID640
Feature activation-7.09e-02

Pos logit contributions
suits+0.072
utes+0.070
nes+0.068
J+0.066
oe+0.064
Neg logit contributions
ocol-0.087
 wolves-0.079
 wolf-0.078
 hunger-0.077
 worries-0.076
Token ago
Token ID2084
Feature activation-9.24e-02

Pos logit contributions
suits+0.143
utes+0.141
J+0.136
nes+0.135
 wrist+0.131
Neg logit contributions
ocol-0.122
 wolves-0.108
 wolf-0.103
 worries-0.103
 hunger-0.103
Token."
Token ID526
Feature activation-2.21e-02

Pos logit contributions
utes+0.159
J+0.155
suits+0.155
nes+0.150
Anna+0.141
Neg logit contributions
ocol-0.181
 wolves-0.162
 wolf-0.155
 worries-0.155
 hunger-0.154
Token\n
Token ID198
Feature activation-7.19e-03

Pos logit contributions
utes+0.029
suits+0.028
nes+0.028
 wrist+0.025
pack+0.024
Neg logit contributions
ocol-0.053
 wolves-0.051
 wolf-0.051
 hunger-0.049
 worries-0.049
Token\n
Token ID198
Feature activation-1.03e-03

Pos logit contributions
suits+0.009
nes+0.009
utes+0.008
 wrist+0.008
suit+0.008
Neg logit contributions
ocol-0.017
 wolves-0.016
 wolf-0.016
elling-0.016
opl-0.016
Token"
Token ID1
Feature activation-2.52e-02

Pos logit contributions
utes+0.001
nes+0.001
suits+0.001
 wrist+0.001
 hips+0.001
Neg logit contributions
 wolves-0.002
ocol-0.002
 dangers-0.002
 wolf-0.002
elling-0.002
TokenMaybe
Token ID13300
Feature activation+1.21e-02

Pos logit contributions
utes+0.044
suits+0.042
nes+0.040
oe+0.039
 hips+0.038
Neg logit contributions
 wolf-0.045
 dangers-0.044
 wolves-0.044
ocol-0.043
 ow-0.041
Token can
Token ID460
Feature activation-3.81e-02

Pos logit contributions
suits+0.060
utes+0.058
nes+0.057
J+0.056
 wrist+0.055
Neg logit contributions
ocol-0.051
 wolves-0.045
elling-0.045
 worries-0.044
arter-0.043
Token be
Token ID307
Feature activation-9.47e-03

Pos logit contributions
suits+0.053
utes+0.052
nes+0.051
oe+0.049
J+0.049
Neg logit contributions
ocol-0.089
 wolves-0.081
elling-0.080
 wolf-0.080
arter-0.079
Token more
Token ID517
Feature activation-1.91e-02

Pos logit contributions
suits+0.012
They+0.011
J+0.011
 together+0.011
oe+0.010
Neg logit contributions
ocol-0.025
opl-0.022
arter-0.022
elling-0.022
 wolf-0.021
Token fun
Token ID1257
Feature activation-6.55e-02

Pos logit contributions
suits+0.034
utes+0.033
nes+0.031
J+0.031
oe+0.030
Neg logit contributions
ocol-0.039
 wolves-0.035
 wolf-0.034
elling-0.034
arter-0.034
Token than
Token ID621
Feature activation-7.77e-03

Pos logit contributions
suits+0.095
utes+0.090
nes+0.086
J+0.084
oe+0.082
Neg logit contributions
ocol-0.154
 wolves-0.140
elling-0.137
 wolf-0.137
 worries-0.137
Token competing
Token ID11780
Feature activation-9.07e-02

Pos logit contributions
oe+0.012
utes+0.012
suits+0.012
J+0.011
nes+0.011
Neg logit contributions
ocol-0.016
elling-0.015
opl-0.015
 wolf-0.015
 wolves-0.014
Token.
Token ID13
Feature activation-9.62e-03

Pos logit contributions
suits+0.133
utes+0.125
nes+0.120
J+0.118
 wrist+0.116
Neg logit contributions
ocol-0.209
 wolves-0.190
elling-0.187
 wolf-0.185
arter-0.185
Token They
Token ID1119
Feature activation-2.95e-02

Pos logit contributions
suits+0.015
nes+0.014
utes+0.014
oe+0.013
 wrist+0.013
Neg logit contributions
ocol-0.021
 wolves-0.020
arter-0.019
elling-0.019
 wolf-0.019
Token are
Token ID389
Feature activation-2.51e-02

Pos logit contributions
suits+0.051
utes+0.048
nes+0.048
J+0.047
 wrist+0.046
Neg logit contributions
ocol-0.062
 wolves-0.055
elling-0.054
 wolf-0.053
arter-0.053
Token good
Token ID922
Feature activation-2.32e-02

Pos logit contributions
suits+0.035
utes+0.034
nes+0.033
J+0.032
 wrist+0.032
Neg logit contributions
ocol-0.060
 wolves-0.053
elling-0.053
 wolf-0.053
arter-0.052
Token brothers
Token ID9397
Feature activation-5.16e-02

Pos logit contributions
suits+0.048
utes+0.046
nes+0.045
J+0.044
oe+0.043
Neg logit contributions
ocol-0.041
 wolves-0.036
elling-0.035
 wolf-0.035
 worries-0.035
Token,
Token ID11
Feature activation-4.73e-02

Pos logit contributions
suits+0.008
utes+0.007
nes+0.007
J+0.007
 wrist+0.007
Neg logit contributions
ocol-0.009
 wolves-0.008
 worries-0.008
 wolf-0.008
elling-0.008
Token
Token ID6184
Feature activation-3.18e-02

Pos logit contributions
suits+0.098
utes+0.089
nes+0.088
pack+0.082
oe+0.082
Neg logit contributions
ocol-0.082
 wolf-0.073
 wolves-0.069
 hunger-0.069
elling-0.067
Token
Token ID95
Feature activation-5.90e-02

Pos logit contributions
suits+0.050
utes+0.047
J+0.046
 wrist+0.044
nes+0.042
Neg logit contributions
ocol-0.070
elling-0.061
opl-0.059
arter-0.055
 emergency-0.055
Token
Token ID26391
Feature activation-4.58e-02

Pos logit contributions
suits+0.102
utes+0.102
J+0.088
nes+0.088
 wrist+0.088
Neg logit contributions
ocol-0.114
opl-0.111
elling-0.108
 emergency-0.095
 worries-0.094
Token
Token ID129
Feature activation-2.29e-02

Pos logit contributions
utes+0.065
suits+0.064
nes+0.062
oe+0.058
J+0.058
Neg logit contributions
opl-0.094
ocol-0.093
 worries-0.088
 wolves-0.087
 wolf-0.087
Token
Token ID241
Feature activation-9.07e-02

Pos logit contributions
suits+0.045
utes+0.041
J+0.040
nes+0.039
oe+0.039
Neg logit contributions
ocol-0.043
opl-0.039
elling-0.038
 hunger-0.037
 wolves-0.037
TokenI
Token ID40
Feature activation-3.20e-03

Pos logit contributions
suits+0.136
utes+0.127
nes+0.119
 wrist+0.115
pack+0.113
Neg logit contributions
ocol-0.204
 wolves-0.183
 wolf-0.181
 worries-0.179
elling-0.178
Token love
Token ID1842
Feature activation-4.73e-02

Pos logit contributions
suits+0.006
utes+0.006
nes+0.005
 wrist+0.005
J+0.005
Neg logit contributions
ocol-0.006
 wolves-0.005
elling-0.005
 worries-0.005
 wolf-0.005
Token baking
Token ID16871
Feature activation-1.28e-02

Pos logit contributions
suits+0.076
utes+0.072
nes+0.070
J+0.068
oe+0.067
Neg logit contributions
ocol-0.103
 wolves-0.093
elling-0.091
 wolf-0.091
arter-0.090
Token for
Token ID329
Feature activation-2.07e-03

Pos logit contributions
suits+0.022
utes+0.021
nes+0.020
J+0.020
oe+0.019
Neg logit contributions
ocol-0.026
 wolves-0.023
elling-0.023
 wolf-0.023
 worries-0.023
Token you
Token ID345
Feature activation-6.25e-03

Pos logit contributions
suits+0.003
utes+0.003
J+0.003
nes+0.003
 wrist+0.003
Neg logit contributions
ocol-0.004
elling-0.004
 wolves-0.004
arter-0.004
 wolf-0.004
Token lost
Token ID2626
Feature activation-1.17e-02

Pos logit contributions
ocol+0.008
 wolves+0.007
elling+0.007
arter+0.007
 wolf+0.007
Neg logit contributions
suits-0.007
nes-0.006
utes-0.006
J-0.006
oe-0.006
Token it
Token ID340
Feature activation+3.12e-03

Pos logit contributions
suits+0.018
utes+0.018
nes+0.017
J+0.017
 wrist+0.017
Neg logit contributions
ocol-0.025
 wolves-0.023
elling-0.023
arter-0.023
 wolf-0.022
Token a
Token ID257
Feature activation-5.61e-02

Pos logit contributions
ocol+0.007
 wolves+0.006
 wolf+0.006
elling+0.006
 worries+0.006
Neg logit contributions
suits-0.005
utes-0.005
J-0.004
nes-0.004
 wrist-0.004
Token long
Token ID890
Feature activation-4.76e-02

Pos logit contributions
suits+0.105
utes+0.101
nes+0.098
J+0.096
oe+0.094
Neg logit contributions
ocol-0.107
 wolves-0.095
 wolf-0.093
elling-0.093
 worries-0.092
Token time
Token ID640
Feature activation-7.42e-02

Pos logit contributions
suits+0.078
utes+0.075
nes+0.072
J+0.071
oe+0.067
Neg logit contributions
ocol-0.103
 wolf-0.092
 wolves-0.091
 hunger-0.090
 worries-0.090
Token ago
Token ID2084
Feature activation-8.98e-02

Pos logit contributions
suits+0.146
utes+0.141
J+0.137
nes+0.136
 wrist+0.131
Neg logit contributions
ocol-0.134
 wolves-0.117
 worries-0.114
 hunger-0.113
 wolf-0.113
Token.
Token ID13
Feature activation-3.54e-03

Pos logit contributions
utes+0.154
J+0.151
suits+0.150
nes+0.144
Anna+0.138
Neg logit contributions
ocol-0.175
 wolves-0.155
 worries-0.154
 hunger-0.153
 wolf-0.150
Token I
Token ID314
Feature activation+7.88e-03

Pos logit contributions
suits+0.006
utes+0.006
nes+0.006
J+0.006
 wrist+0.006
Neg logit contributions
ocol-0.007
 wolves-0.006
 worries-0.006
 wolf-0.006
elling-0.006
Token thought
Token ID1807
Feature activation-8.98e-03

Pos logit contributions
ocol+0.017
 wolves+0.015
elling+0.015
arter+0.015
 wolf+0.015
Neg logit contributions
suits-0.013
nes-0.012
utes-0.012
oe-0.012
J-0.012
Token I
Token ID314
Feature activation-2.83e-03

Pos logit contributions
suits+0.012
utes+0.012
nes+0.011
 wrist+0.011
J+0.010
Neg logit contributions
ocol-0.022
 wolves-0.019
elling-0.019
arter-0.019
 wolf-0.019
Token would
Token ID561
Feature activation-5.08e-02

Pos logit contributions
suits+0.004
nes+0.004
J+0.004
utes+0.004
oe+0.004
Neg logit contributions
ocol-0.007
 wolves-0.006
elling-0.006
arter-0.006
 wolf-0.006
Token best
Token ID1266
Feature activation-3.00e-02

Pos logit contributions
suits+0.027
utes+0.024
 wrist+0.024
oe+0.023
nes+0.023
Neg logit contributions
ocol-0.053
elling-0.048
 wolves-0.048
arter-0.047
 wolf-0.046
Token friend
Token ID1545
Feature activation-2.30e-02

Pos logit contributions
suits+0.049
utes+0.047
nes+0.045
J+0.045
oe+0.043
Neg logit contributions
ocol-0.064
 wolves-0.058
 wolf-0.057
 worries-0.057
elling-0.056
Token a
Token ID257
Feature activation-4.72e-02

Pos logit contributions
suits+0.034
utes+0.033
J+0.031
nes+0.031
oe+0.029
Neg logit contributions
ocol-0.053
 wolves-0.048
 wolf-0.048
 worries-0.047
elling-0.047
Token long
Token ID890
Feature activation-3.92e-02

Pos logit contributions
suits+0.088
utes+0.083
nes+0.081
J+0.079
oe+0.079
Neg logit contributions
ocol-0.092
 wolves-0.081
elling-0.080
 wolf-0.079
arter-0.079
Token time
Token ID640
Feature activation-6.83e-02

Pos logit contributions
suits+0.069
utes+0.069
nes+0.066
J+0.064
pack+0.061
Neg logit contributions
ocol-0.076
 wolf-0.070
 wolves-0.069
 worries-0.068
 hunger-0.068
Token ago
Token ID2084
Feature activation-8.93e-02

Pos logit contributions
suits+0.109
utes+0.105
nes+0.099
J+0.099
 wrist+0.095
Neg logit contributions
ocol-0.149
 wolves-0.134
 worries-0.131
 wolf-0.130
arter-0.130
Token.
Token ID13
Feature activation+2.76e-03

Pos logit contributions
utes+0.158
suits+0.153
J+0.152
nes+0.147
Anna+0.143
Neg logit contributions
ocol-0.167
 wolves-0.148
 worries-0.146
 hunger-0.144
 earthquake-0.143
Token He
Token ID679
Feature activation-1.47e-02

Pos logit contributions
ocol+0.006
 wolves+0.005
 wolf+0.005
elling+0.005
 hunger+0.005
Neg logit contributions
suits-0.005
utes-0.005
nes-0.005
J-0.004
 wrist-0.004
Token was
Token ID373
Feature activation-1.60e-02

Pos logit contributions
suits+0.025
nes+0.023
oe+0.023
utes+0.022
 wrist+0.022
Neg logit contributions
ocol-0.033
arter-0.028
 wolves-0.027
elling-0.027
 earthquake-0.027
Token my
Token ID616
Feature activation-4.43e-03

Pos logit contributions
suits+0.024
nes+0.024
utes+0.024
oe+0.023
J+0.023
Neg logit contributions
ocol-0.035
elling-0.031
 wolves-0.030
arter-0.030
 worries-0.030
Token dog
Token ID3290
Feature activation+3.11e-03

Pos logit contributions
suits+0.008
utes+0.007
nes+0.007
J+0.007
oe+0.007
Neg logit contributions
ocol-0.009
 wolves-0.008
elling-0.008
 wolf-0.008
 worries-0.008
Token,
Token ID11
Feature activation-4.63e-02

Pos logit contributions
suits+0.004
utes+0.004
nes+0.004
J+0.004
 wrist+0.004
Neg logit contributions
ocol-0.004
 wolf-0.004
 worries-0.004
 wolves-0.004
elling-0.004
Token
Token ID6184
Feature activation-2.65e-02

Pos logit contributions
suits+0.106
nes+0.098
utes+0.098
pack+0.093
 wrist+0.089
Neg logit contributions
ocol-0.064
 wolf-0.062
 wolves-0.055
der-0.054
 accidents-0.054
Token
Token ID95
Feature activation-5.17e-02

Pos logit contributions
suits+0.044
utes+0.041
J+0.038
Mom+0.037
They+0.036
Neg logit contributions
ocol-0.058
elling-0.050
arter-0.048
 earthquake-0.048
opl-0.048
Token
Token ID26391
Feature activation-3.70e-02

Pos logit contributions
suits+0.090
utes+0.087
oe+0.076
J+0.074
 wrist+0.073
Neg logit contributions
ocol-0.102
elling-0.097
opl-0.096
 earthquake-0.089
 destruction-0.087
Token
Token ID129
Feature activation-1.88e-02

Pos logit contributions
suits+0.058
utes+0.058
nes+0.054
J+0.053
pack+0.052
Neg logit contributions
opl-0.068
 worries-0.067
ocol-0.067
 shouldn-0.063
 wolves-0.063
Token
Token ID241
Feature activation-8.84e-02

Pos logit contributions
suits+0.038
utes+0.036
J+0.035
nes+0.034
oe+0.034
Neg logit contributions
ocol-0.034
opl-0.030
elling-0.030
 wolves-0.029
 hunger-0.029
TokenWe
Token ID1135
Feature activation-2.12e-02

Pos logit contributions
suits+0.130
utes+0.124
nes+0.113
pack+0.111
 wrist+0.109
Neg logit contributions
ocol-0.190
 wolves-0.175
 wolf-0.174
 worries-0.172
 hunger-0.169
Token all
Token ID477
Feature activation+1.46e-02

Pos logit contributions
suits+0.037
nes+0.035
utes+0.035
J+0.034
oe+0.033
Neg logit contributions
ocol-0.044
 wolves-0.039
arter-0.039
elling-0.039
 wolf-0.038
Token wish
Token ID4601
Feature activation-6.36e-03

Pos logit contributions
ocol+0.034
arter+0.031
 wolves+0.031
 wolf+0.030
elling+0.030
Neg logit contributions
suits-0.022
nes-0.022
oe-0.020
 wrist-0.019
utes-0.019
Token for
Token ID329
Feature activation+5.25e-03

Pos logit contributions
suits+0.008
utes+0.008
nes+0.008
oe+0.007
J+0.007
Neg logit contributions
ocol-0.016
 wolves-0.014
elling-0.014
 wolf-0.014
arter-0.014
Token lux
Token ID10632
Feature activation-2.53e-02

Pos logit contributions
ocol+0.013
arter+0.012
 wolves+0.012
elling+0.012
 wolf+0.012
Neg logit contributions
suits-0.006
utes-0.006
 wrist-0.006
nes-0.006
J-0.006
Token Now
Token ID2735
Feature activation-2.11e-02

Pos logit contributions
suits+0.001
utes+0.001
nes+0.001
J+0.001
oe+0.001
Neg logit contributions
ocol-0.001
 wolves-0.001
elling-0.001
 wolf-0.001
arter-0.001
Token you
Token ID345
Feature activation+1.64e-02

Pos logit contributions
suits+0.032
utes+0.030
nes+0.029
J+0.028
oe+0.028
Neg logit contributions
ocol-0.048
 wolves-0.043
elling-0.043
 wolf-0.043
arter-0.042
Token have
Token ID423
Feature activation-3.24e-03

Pos logit contributions
ocol+0.035
 wolves+0.030
elling+0.030
arter+0.030
 wolf+0.030
Neg logit contributions
suits-0.029
nes-0.027
utes-0.026
oe-0.026
J-0.025
Token nothing
Token ID2147
Feature activation+1.67e-02

Pos logit contributions
suits+0.004
nes+0.004
utes+0.004
oe+0.003
J+0.003
Neg logit contributions
ocol-0.008
arter-0.008
elling-0.008
opl-0.008
 wolves-0.007
Token to
Token ID284
Feature activation-2.84e-02

Pos logit contributions
ocol+0.044
arter+0.039
elling+0.039
 wolves+0.039
 hunger+0.038
Neg logit contributions
suits-0.021
nes-0.019
utes-0.018
oe-0.018
Anna-0.017
Token play
Token ID711
Feature activation-8.82e-02

Pos logit contributions
suits+0.046
utes+0.044
nes+0.041
J+0.041
oe+0.040
Neg logit contributions
ocol-0.062
 wolves-0.056
 wolf-0.055
arter-0.055
 worries-0.055
Token with
Token ID351
Feature activation-2.63e-02

Pos logit contributions
suits+0.115
utes+0.108
nes+0.098
oe+0.097
J+0.096
Neg logit contributions
ocol-0.214
 wolves-0.199
 wolf-0.196
 emergency-0.194
arter-0.193
Token."
Token ID526
Feature activation-2.97e-02

Pos logit contributions
suits+0.037
utes+0.035
nes+0.034
J+0.033
oe+0.032
Neg logit contributions
ocol-0.063
 wolves-0.057
elling-0.056
 wolf-0.056
arter-0.056
Token
Token ID198
Feature activation-1.42e-02

Pos logit contributions
nes+0.049
suits+0.047
utes+0.046
 wrist+0.044
pack+0.042
Neg logit contributions
ocol-0.061
 wolves-0.058
 wolf-0.057
 hunger-0.055
 worries-0.054
Token
Token ID198
Feature activation-8.68e-03

Pos logit contributions
suits+0.019
nes+0.019
 wrist+0.017
utes+0.017
suit+0.016
Neg logit contributions
ocol-0.034
 wolves-0.032
 wolf-0.031
opl-0.031
 hunger-0.031
TokenBen
Token ID11696
Feature activation-2.06e-02

Pos logit contributions
suits+0.009
nes+0.009
utes+0.008
 wrist+0.008
oe+0.007
Neg logit contributions
ocol-0.024
 wolves-0.022
 wolf-0.022
 dangers-0.022
arter-0.021
Token,
Token ID11
Feature activation-4.75e-02

Pos logit contributions
suits+0.001
utes+0.001
nes+0.001
J+0.001
oe+0.001
Neg logit contributions
ocol-0.002
elling-0.002
arter-0.002
 wolves-0.002
opl-0.002
Token
Token ID6184
Feature activation-1.75e-02

Pos logit contributions
suits+0.083
utes+0.080
nes+0.077
J+0.074
oe+0.073
Neg logit contributions
ocol-0.097
 wolf-0.089
 wolves-0.089
 hunger-0.085
elling-0.085
Token
Token ID95
Feature activation-4.25e-02

Pos logit contributions
suits+0.030
utes+0.029
J+0.027
nes+0.027
oe+0.025
Neg logit contributions
ocol-0.037
elling-0.032
opl-0.032
arter-0.031
 wolf-0.031
Token
Token ID26391
Feature activation-3.53e-02

Pos logit contributions
suits+0.071
utes+0.070
J+0.062
nes+0.062
oe+0.060
Neg logit contributions
ocol-0.088
opl-0.083
elling-0.082
 wolves-0.077
 wolf-0.076
Token
Token ID129
Feature activation-1.73e-02

Pos logit contributions
utes+0.052
suits+0.049
nes+0.049
J+0.045
oe+0.043
Neg logit contributions
opl-0.073
ocol-0.070
 wolf-0.069
 worries-0.069
 shouldn-0.069
Token
Token ID241
Feature activation-8.82e-02

Pos logit contributions
suits+0.034
utes+0.032
J+0.031
nes+0.031
oe+0.030
Neg logit contributions
ocol-0.031
opl-0.029
elling-0.028
 wolves-0.028
 wolf-0.028
TokenIt
Token ID1026
Feature activation+1.93e-02

Pos logit contributions
suits+0.121
utes+0.118
nes+0.107
 wrist+0.102
oe+0.098
Neg logit contributions
ocol-0.205
 wolf-0.191
 wolves-0.191
 hunger-0.186
 worries-0.185
Tokenâ
Token ID22940
Feature activation-1.11e-02

Pos logit contributions
ocol+0.056
arter+0.047
 hunger+0.046
opl+0.045
 earthquake+0.044
Neg logit contributions
a-0.021
suits-0.021
They-0.020
oe-0.020
z-0.020
Token
Token ID26391
Feature activation-4.49e-02

Pos logit contributions
suits+0.019
utes+0.019
nes+0.017
J+0.017
 wrist+0.017
Neg logit contributions
ocol-0.023
elling-0.021
 wolves-0.021
opl-0.020
 wolf-0.020
Token
Token ID8151
Feature activation-7.18e-03

Pos logit contributions
utes+0.068
suits+0.065
nes+0.063
J+0.060
pack+0.058
Neg logit contributions
ocol-0.091
opl-0.088
 wolf-0.086
 wolves-0.086
 worries-0.086
Tokens
Token ID82
Feature activation+6.96e-03

Pos logit contributions
suits+0.012
utes+0.012
nes+0.011
 wrist+0.011
J+0.011
Neg logit contributions
ocol-0.015
elling-0.013
 wolves-0.013
 worries-0.013
arter-0.013
Token that
Token ID326
Feature activation-1.12e-02

Pos logit contributions
suits+0.006
utes+0.006
J+0.006
nes+0.006
 wrist+0.006
Neg logit contributions
ocol-0.009
 wolves-0.008
 wolf-0.008
arter-0.008
elling-0.007
Token sharing
Token ID7373
Feature activation-8.25e-03

Pos logit contributions
utes+0.017
suits+0.017
 wrist+0.016
J+0.016
nes+0.016
Neg logit contributions
ocol-0.025
elling-0.022
arter-0.021
 wolf-0.021
opl-0.021
Token is
Token ID318
Feature activation-2.56e-02

Pos logit contributions
suits+0.014
utes+0.014
J+0.013
 wrist+0.013
nes+0.013
Neg logit contributions
ocol-0.017
 wolves-0.015
 wolf-0.015
arter-0.015
elling-0.015
Token better
Token ID1365
Feature activation-2.99e-02

Pos logit contributions
utes+0.038
They+0.037
suits+0.037
J+0.036
ends+0.035
Neg logit contributions
ocol-0.060
elling-0.052
arter-0.052
opl-0.051
 wolves-0.050
Token than
Token ID621
Feature activation-4.80e-03

Pos logit contributions
suits+0.049
utes+0.047
nes+0.045
J+0.044
oe+0.043
Neg logit contributions
ocol-0.065
 wolves-0.058
elling-0.058
 wolf-0.058
arter-0.057
Token competing
Token ID11780
Feature activation-8.81e-02

Pos logit contributions
utes+0.008
suits+0.007
oe+0.007
J+0.007
nes+0.007
Neg logit contributions
ocol-0.010
opl-0.010
elling-0.009
 wolf-0.009
 wolves-0.009
Token.
Token ID13
Feature activation-1.11e-02

Pos logit contributions
suits+0.132
utes+0.124
nes+0.123
J+0.119
 wrist+0.116
Neg logit contributions
ocol-0.196
 wolves-0.179
elling-0.176
 wolf-0.174
 worries-0.173
Token They
Token ID1119
Feature activation-2.66e-02

Pos logit contributions
suits+0.016
nes+0.015
utes+0.015
oe+0.014
z+0.014
Neg logit contributions
ocol-0.025
 wolves-0.023
elling-0.023
arter-0.023
 wolf-0.022
Token learn
Token ID2193
Feature activation-4.13e-03

Pos logit contributions
suits+0.046
utes+0.044
nes+0.043
J+0.042
 wrist+0.041
Neg logit contributions
ocol-0.055
 wolves-0.049
elling-0.048
 wolf-0.048
arter-0.048
Token that
Token ID326
Feature activation-1.20e-02

Pos logit contributions
suits+0.007
utes+0.007
nes+0.007
J+0.007
 wrist+0.006
Neg logit contributions
ocol-0.009
 wolves-0.008
 wolf-0.007
elling-0.007
arter-0.007
Token purple
Token ID14032
Feature activation-5.26e-03

Pos logit contributions
utes+0.019
suits+0.019
 wrist+0.018
J+0.018
nes+0.017
Neg logit contributions
ocol-0.026
elling-0.023
arter-0.023
 wolves-0.023
 wolf-0.022
Token have
Token ID423
Feature activation-6.46e-03

Pos logit contributions
nes+0.052
suits+0.051
oe+0.049
They+0.049
utes+0.048
Neg logit contributions
ocol-0.089
elling-0.083
 wolves-0.079
 worries-0.077
 wolf-0.076
Token no
Token ID645
Feature activation+9.38e-03

Pos logit contributions
suits+0.009
nes+0.009
J+0.009
 wrist+0.009
They+0.009
Neg logit contributions
ocol-0.015
arter-0.013
elling-0.013
opl-0.013
 wolves-0.013
Token cookie
Token ID19751
Feature activation-2.77e-02

Pos logit contributions
ocol+0.020
elling+0.018
arter+0.017
opl+0.017
 wolves+0.017
Neg logit contributions
suits-0.015
 wrist-0.014
oe-0.014
utes-0.014
nes-0.014
Token and
Token ID290
Feature activation-2.47e-02

Pos logit contributions
suits+0.040
utes+0.038
nes+0.036
J+0.035
oe+0.035
Neg logit contributions
ocol-0.065
 wolves-0.059
elling-0.058
 wolf-0.058
 worries-0.058
Token no
Token ID645
Feature activation+8.89e-03

Pos logit contributions
 wrist+0.031
nes+0.029
suits+0.029
They+0.029
Anna+0.028
Neg logit contributions
ocol-0.059
elling-0.056
opl-0.054
arter-0.053
 wolf-0.050
Token play
Token ID711
Feature activation-8.75e-02

Pos logit contributions
ocol+0.019
elling+0.018
arter+0.017
opl+0.017
 wolves+0.015
Neg logit contributions
suits-0.014
 wrist-0.013
oe-0.013
utes-0.012
Anna-0.012
Tokentime
Token ID2435
Feature activation-8.22e-02

Pos logit contributions
suits+0.089
utes+0.082
nes+0.074
J+0.072
oe+0.070
Neg logit contributions
ocol-0.245
 wolves-0.227
 wolf-0.224
arter-0.221
 hunger-0.221
Token."
Token ID526
Feature activation-4.27e-02

Pos logit contributions
suits+0.137
utes+0.132
nes+0.124
J+0.123
 wrist+0.120
Neg logit contributions
ocol-0.173
 wolves-0.157
 wolf-0.153
 worries-0.153
 hunger-0.152
Token
Token ID198
Feature activation-2.27e-02

Pos logit contributions
nes+0.069
suits+0.069
z+0.062
utes+0.061
pack+0.061
Neg logit contributions
ocol-0.089
 wolf-0.086
 wolves-0.086
 hunger-0.083
 accidents-0.082
Token
Token ID198
Feature activation-1.84e-02

Pos logit contributions
suits+0.033
nes+0.032
 wrist+0.030
utes+0.029
suit+0.029
Neg logit contributions
ocol-0.050
 wolves-0.048
 wolf-0.048
opl-0.046
 ow-0.046
TokenBen
Token ID11696
Feature activation-2.38e-02

Pos logit contributions
suits+0.021
nes+0.021
utes+0.019
 wrist+0.019
suit+0.018
Neg logit contributions
ocol-0.046
 wolves-0.043
 wolf-0.043
 dangers-0.042
 accidents-0.042
Token had
Token ID550
Feature activation-1.43e-02

Pos logit contributions
suits+0.038
utes+0.036
nes+0.036
They+0.035
J+0.035
Neg logit contributions
ocol-0.064
arter-0.056
 wolves-0.055
 hunger-0.054
elling-0.054
Token lived
Token ID5615
Feature activation-3.87e-02

Pos logit contributions
suits+0.021
utes+0.021
nes+0.020
J+0.020
 wrist+0.019
Neg logit contributions
ocol-0.033
elling-0.029
arter-0.029
 wolves-0.029
 wolf-0.029
Token a
Token ID257
Feature activation-3.94e-02

Pos logit contributions
suits+0.051
utes+0.048
nes+0.045
 wrist+0.043
J+0.043
Neg logit contributions
ocol-0.095
 wolves-0.088
 wolf-0.086
 hunger-0.085
elling-0.085
Token long
Token ID890
Feature activation-2.87e-02

Pos logit contributions
suits+0.077
utes+0.073
nes+0.071
J+0.070
oe+0.068
Neg logit contributions
ocol-0.072
 wolves-0.064
 wolf-0.064
 worries-0.062
 hunger-0.062
Token time
Token ID640
Feature activation-6.44e-02

Pos logit contributions
suits+0.050
utes+0.048
nes+0.047
J+0.046
oe+0.044
Neg logit contributions
ocol-0.058
 wolves-0.052
 wolf-0.052
 hunger-0.051
 worries-0.051
Token ago
Token ID2084
Feature activation-8.73e-02

Pos logit contributions
suits+0.109
utes+0.104
nes+0.100
J+0.098
oe+0.097
Neg logit contributions
ocol-0.135
 wolves-0.120
elling-0.120
 wolf-0.119
arter-0.117
Token.
Token ID13
Feature activation+4.17e-03

Pos logit contributions
suits+0.139
utes+0.136
nes+0.129
J+0.129
Anna+0.125
Neg logit contributions
ocol-0.184
 wolves-0.166
 hunger-0.163
 worries-0.163
 wolf-0.163
Token
Token ID220
Feature activation+2.14e-02

Pos logit contributions
ocol+0.009
 wolves+0.008
elling+0.008
 wolf+0.008
arter+0.008
Neg logit contributions
suits-0.007
utes-0.007
nes-0.006
 wrist-0.006
J-0.006
Token
Token ID198
Feature activation+7.63e-03

Pos logit contributions
ocol+0.050
opl+0.046
elling+0.046
 wolves+0.045
arter+0.045
Neg logit contributions
suits-0.029
utes-0.025
 wrist-0.025
nes-0.025
suit-0.024
Token
Token ID198
Feature activation+1.67e-02

Pos logit contributions
ocol+0.012
elling+0.012
opl+0.012
 wolf+0.012
 wolves+0.012
Neg logit contributions
 wrist-0.013
suits-0.013
utes-0.013
nes-0.012
suit-0.012
TokenL
Token ID43
Feature activation-2.47e-02

Pos logit contributions
ocol+0.038
elling+0.037
 accidents+0.036
 awful+0.036
 wolves+0.036
Neg logit contributions
suits-0.022
utes-0.021
 wrist-0.020
nes-0.018
 hips-0.017
Token,
Token ID11
Feature activation-4.50e-02

Pos logit contributions
suits+0.002
utes+0.002
nes+0.002
J+0.002
 wrist+0.002
Neg logit contributions
ocol-0.003
 wolves-0.003
 wolf-0.003
 worries-0.003
elling-0.003
Token
Token ID6184
Feature activation-2.98e-02

Pos logit contributions
suits+0.093
utes+0.088
nes+0.086
pack+0.081
 wrist+0.081
Neg logit contributions
ocol-0.075
 wolf-0.069
 wolves-0.065
elling-0.063
 hunger-0.063
Token
Token ID95
Feature activation-5.47e-02

Pos logit contributions
suits+0.050
utes+0.048
J+0.044
nes+0.044
 wrist+0.042
Neg logit contributions
ocol-0.066
elling-0.056
arter-0.054
 wolf-0.054
opl-0.054
Token
Token ID26391
Feature activation-4.16e-02

Pos logit contributions
suits+0.096
utes+0.093
oe+0.083
J+0.082
nes+0.082
Neg logit contributions
ocol-0.112
opl-0.102
elling-0.102
 wolves-0.095
 wolf-0.095
Token
Token ID129
Feature activation-1.86e-02

Pos logit contributions
utes+0.064
suits+0.064
nes+0.061
oe+0.059
J+0.057
Neg logit contributions
ocol-0.082
opl-0.080
 worries-0.078
 wolf-0.077
 wolves-0.077
Token
Token ID241
Feature activation-8.71e-02

Pos logit contributions
suits+0.038
utes+0.036
J+0.035
nes+0.035
oe+0.034
Neg logit contributions
ocol-0.033
elling-0.029
opl-0.029
 wolves-0.029
 wolf-0.028
TokenNo
Token ID2949
Feature activation+1.52e-04

Pos logit contributions
suits+0.128
utes+0.123
nes+0.114
 wrist+0.109
pack+0.106
Neg logit contributions
ocol-0.192
 wolves-0.171
 worries-0.171
 wolf-0.171
opl-0.169
Token,
Token ID11
Feature activation-5.13e-02

Pos logit contributions
ocol+0.000
 wolves+0.000
elling+0.000
 wolf+0.000
arter+0.000
Neg logit contributions
suits-0.000
utes-0.000
nes-0.000
J-0.000
oe-0.000
Token Marcus
Token ID17068
Feature activation-1.22e-03

Pos logit contributions
suits+0.082
utes+0.078
nes+0.075
J+0.074
oe+0.072
Neg logit contributions
ocol-0.112
 wolves-0.101
elling-0.099
 wolf-0.099
 worries-0.098
Token.
Token ID13
Feature activation-7.11e-03

Pos logit contributions
suits+0.002
utes+0.002
nes+0.002
J+0.002
oe+0.002
Neg logit contributions
ocol-0.003
 wolves-0.002
elling-0.002
 wolf-0.002
arter-0.002
Token It
Token ID632
Feature activation+1.06e-02

Pos logit contributions
suits+0.014
utes+0.013
nes+0.013
J+0.013
oe+0.013
Neg logit contributions
ocol-0.013
 wolves-0.011
elling-0.011
 wolf-0.011
 worries-0.011
Token to
Token ID284
Feature activation+3.51e-05

Pos logit contributions
suits+0.001
utes+0.001
J+0.001
nes+0.001
oe+0.001
Neg logit contributions
ocol-0.003
arter-0.003
elling-0.003
opl-0.003
 wolves-0.003
Token prevent
Token ID2948
Feature activation+2.67e-02

Pos logit contributions
ocol+0.000
 wolves+0.000
elling+0.000
 wolf+0.000
 worries+0.000
Neg logit contributions
suits-0.000
utes-0.000
nes-0.000
J-0.000
oe-0.000
Token the
Token ID262
Feature activation-1.40e-02

Pos logit contributions
ocol+0.053
elling+0.047
arter+0.047
 wolves+0.045
 wolf+0.044
Neg logit contributions
utes-0.043
nes-0.043
suits-0.043
J-0.042
oe-0.042
Token bad
Token ID2089
Feature activation+5.03e-03

Pos logit contributions
suits+0.027
utes+0.026
nes+0.025
oe+0.025
J+0.024
Neg logit contributions
ocol-0.026
elling-0.023
arter-0.022
opl-0.021
 wolves-0.021
Token guys
Token ID3730
Feature activation-1.93e-02

Pos logit contributions
ocol+0.010
elling+0.008
arter+0.008
opl+0.008
 hunger+0.007
Neg logit contributions
suits-0.010
utes-0.009
nes-0.009
oe-0.009
 wrist-0.009
Token (
Token ID357
Feature activation-8.70e-02

Pos logit contributions
suits+0.031
utes+0.030
J+0.028
nes+0.027
 wrist+0.027
Neg logit contributions
ocol-0.042
 wolves-0.038
 wolf-0.037
arter-0.037
 worries-0.036
Tokenthe
Token ID1169
Feature activation-2.58e-02

Pos logit contributions
suits+0.132
utes+0.122
 wrist+0.119
nes+0.111
 hips+0.104
Neg logit contributions
ocol-0.196
elling-0.174
 wolves-0.173
opl-0.171
 destruction-0.169
Token toys
Token ID14958
Feature activation-1.49e-02

Pos logit contributions
suits+0.049
utes+0.047
nes+0.046
oe+0.045
J+0.044
Neg logit contributions
ocol-0.049
elling-0.043
arter-0.042
opl-0.040
 wolves-0.040
Token on
Token ID319
Feature activation+1.31e-02

Pos logit contributions
suits+0.022
utes+0.020
nes+0.020
J+0.019
oe+0.019
Neg logit contributions
ocol-0.035
elling-0.031
 wolves-0.031
arter-0.031
 wolf-0.031
Token the
Token ID262
Feature activation-2.72e-02

Pos logit contributions
ocol+0.029
elling+0.026
arter+0.026
 wolves+0.025
 worries+0.025
Neg logit contributions
suits-0.020
utes-0.020
nes-0.019
J-0.019
oe-0.018
Token floor
Token ID4314
Feature activation-2.17e-02

Pos logit contributions
suits+0.046
utes+0.044
nes+0.042
J+0.041
oe+0.040
Neg logit contributions
ocol-0.057
 wolves-0.052
 wolf-0.051
elling-0.051
 worries-0.051
Token,
Token ID11
Feature activation-4.87e-02

Pos logit contributions
suits+0.005
utes+0.005
nes+0.004
J+0.004
oe+0.004
Neg logit contributions
ocol-0.007
 wolves-0.006
elling-0.006
 wolf-0.006
 worries-0.006
Token
Token ID6184
Feature activation-2.71e-02

Pos logit contributions
suits+0.093
utes+0.088
nes+0.086
J+0.082
 wrist+0.081
Neg logit contributions
ocol-0.091
 wolf-0.082
 wolves-0.080
 hunger-0.078
elling-0.077
Token
Token ID95
Feature activation-5.10e-02

Pos logit contributions
suits+0.047
utes+0.044
J+0.041
nes+0.040
 wrist+0.039
Neg logit contributions
ocol-0.058
elling-0.049
 wolf-0.049
arter-0.048
 earthquake-0.048
Token
Token ID26391
Feature activation-4.20e-02

Pos logit contributions
suits+0.088
utes+0.086
J+0.075
nes+0.075
oe+0.074
Neg logit contributions
ocol-0.105
elling-0.096
opl-0.095
 wolf-0.091
 wolves-0.091
Token
Token ID129
Feature activation-1.69e-02

Pos logit contributions
utes+0.068
suits+0.063
nes+0.060
pack+0.059
oe+0.058
Neg logit contributions
opl-0.078
 wolf-0.077
ocol-0.076
 shouldn-0.076
 worries-0.076
Token
Token ID241
Feature activation-8.69e-02

Pos logit contributions
suits+0.034
utes+0.032
J+0.031
nes+0.031
oe+0.030
Neg logit contributions
ocol-0.031
opl-0.027
elling-0.027
 wolves-0.026
 wolf-0.026
TokenLet
Token ID5756
Feature activation-2.03e-02

Pos logit contributions
suits+0.121
utes+0.115
nes+0.104
 wrist+0.101
pack+0.099
Neg logit contributions
ocol-0.201
 wolves-0.183
 wolf-0.183
 worries-0.180
 hunger-0.179
Tokenâ
Token ID22940
Feature activation-1.60e-02

Pos logit contributions
suits+0.038
utes+0.037
nes+0.034
 wrist+0.034
J+0.033
Neg logit contributions
ocol-0.039
 wolves-0.035
 wolf-0.035
elling-0.035
 worries-0.035
Token
Token ID26391
Feature activation-4.81e-02

Pos logit contributions
suits+0.029
utes+0.029
J+0.024
nes+0.024
 wrist+0.024
Neg logit contributions
ocol-0.032
opl-0.029
elling-0.029
 earthquake-0.028
 worries-0.027
Token
Token ID8151
Feature activation-1.09e-02

Pos logit contributions
utes+0.081
suits+0.072
oe+0.071
J+0.069
nes+0.069
Neg logit contributions
 wolves-0.086
ocol-0.084
 wolf-0.084
 worries-0.084
opl-0.083
Tokens
Token ID82
Feature activation-2.47e-02

Pos logit contributions
utes+0.019
suits+0.019
 wrist+0.018
nes+0.017
pack+0.016
Neg logit contributions
ocol-0.021
elling-0.019
opl-0.018
arter-0.018
 wolves-0.018
Token it
Token ID340
Feature activation+4.74e-05

Pos logit contributions
suits+0.036
utes+0.034
nes+0.033
J+0.032
oe+0.032
Neg logit contributions
ocol-0.056
 wolves-0.050
elling-0.050
arter-0.050
 worries-0.050
Token here
Token ID994
Feature activation-3.14e-02

Pos logit contributions
ocol+0.000
 wolves+0.000
elling+0.000
 wolf+0.000
 worries+0.000
Neg logit contributions
suits-0.000
utes-0.000
nes-0.000
oe-0.000
J-0.000
Token a
Token ID257
Feature activation-5.58e-02

Pos logit contributions
suits+0.047
utes+0.045
J+0.043
nes+0.043
oe+0.041
Neg logit contributions
ocol-0.070
 wolves-0.064
 wolf-0.063
elling-0.063
 worries-0.062
Token long
Token ID890
Feature activation-3.89e-02

Pos logit contributions
suits+0.105
utes+0.100
nes+0.097
J+0.095
oe+0.094
Neg logit contributions
ocol-0.107
 wolves-0.095
elling-0.094
 worries-0.092
 wolf-0.092
Token time
Token ID640
Feature activation-6.95e-02

Pos logit contributions
suits+0.068
utes+0.067
nes+0.064
J+0.063
oe+0.060
Neg logit contributions
ocol-0.079
 wolf-0.070
 wolves-0.070
 hunger-0.069
 worries-0.068
Token ago
Token ID2084
Feature activation-8.67e-02

Pos logit contributions
suits+0.134
utes+0.131
J+0.125
nes+0.125
 wrist+0.121
Neg logit contributions
ocol-0.127
 wolves-0.112
 worries-0.109
elling-0.109
 wolf-0.109
Token."
Token ID526
Feature activation-1.86e-02

Pos logit contributions
utes+0.150
suits+0.147
J+0.145
nes+0.142
Anna+0.135
Neg logit contributions
ocol-0.167
 wolves-0.147
 worries-0.144
 wolf-0.143
 hunger-0.143
Token
Token ID198
Feature activation-5.60e-03

Pos logit contributions
suits+0.026
nes+0.025
utes+0.025
 wrist+0.022
oe+0.022
Neg logit contributions
ocol-0.043
 wolves-0.041
 wolf-0.040
 worries-0.039
 hunger-0.039
Token
Token ID198
Feature activation+4.89e-03

Pos logit contributions
suits+0.009
nes+0.008
 wrist+0.008
suit+0.008
utes+0.007
Neg logit contributions
ocol-0.012
 wolves-0.011
 wolf-0.011
opl-0.011
 ow-0.011
TokenThey
Token ID2990
Feature activation-3.33e-02

Pos logit contributions
 wolves+0.010
ocol+0.010
 wolf+0.010
 dangers+0.010
 poison+0.009
Neg logit contributions
suits-0.007
 wrist-0.007
nes-0.007
suit-0.007
utes-0.006
Token decided
Token ID3066
Feature activation+5.37e-03

Pos logit contributions
utes+0.059
suits+0.058
nes+0.054
oe+0.053
 wrist+0.052
Neg logit contributions
ocol-0.064
 wolves-0.059
 worries-0.058
 wolf-0.058
 hunger-0.058
Token is
Token ID318
Feature activation-2.14e-02

Pos logit contributions
ocol+0.040
arter+0.034
 earthquake+0.034
 hunger+0.034
 wolves+0.033
Neg logit contributions
suits-0.023
oe-0.021
They-0.020
a-0.020
Anna-0.020
Token from
Token ID422
Feature activation-5.88e-03

Pos logit contributions
suits+0.026
utes+0.025
oe+0.023
 wrist+0.023
ends+0.023
Neg logit contributions
ocol-0.052
elling-0.049
opl-0.048
 wolves-0.046
arter-0.046
Token a
Token ID257
Feature activation-5.31e-02

Pos logit contributions
utes+0.008
suits+0.008
 wrist+0.007
oe+0.007
nes+0.007
Neg logit contributions
ocol-0.013
elling-0.013
arter-0.012
 wolves-0.012
opl-0.012
Token long
Token ID890
Feature activation-3.42e-02

Pos logit contributions
suits+0.100
utes+0.094
J+0.091
nes+0.090
oe+0.090
Neg logit contributions
ocol-0.103
elling-0.092
 wolves-0.089
 worries-0.087
arter-0.087
Token time
Token ID640
Feature activation-6.42e-02

Pos logit contributions
suits+0.059
utes+0.057
nes+0.055
J+0.054
oe+0.052
Neg logit contributions
ocol-0.071
 wolves-0.064
 wolf-0.063
 hunger-0.062
 worries-0.062
Token ago
Token ID2084
Feature activation-8.67e-02

Pos logit contributions
suits+0.116
utes+0.111
nes+0.108
J+0.106
oe+0.104
Neg logit contributions
ocol-0.127
 wolves-0.113
elling-0.110
 wolf-0.110
 worries-0.109
Token,
Token ID11
Feature activation-4.34e-02

Pos logit contributions
utes+0.149
J+0.147
suits+0.147
nes+0.142
Anna+0.136
Neg logit contributions
ocol-0.165
 wolves-0.148
 worries-0.143
 wolf-0.143
 hunger-0.143
Token when
Token ID618
Feature activation-2.42e-02

Pos logit contributions
suits+0.066
utes+0.062
nes+0.060
J+0.059
 wrist+0.058
Neg logit contributions
ocol-0.099
 wolves-0.088
elling-0.088
 wolf-0.087
arter-0.087
Token people
Token ID661
Feature activation-7.13e-03

Pos logit contributions
suits+0.027
utes+0.026
nes+0.024
 wrist+0.023
J+0.023
Neg logit contributions
ocol-0.064
opl-0.059
elling-0.059
arter-0.058
 wolves-0.056
Token lived
Token ID5615
Feature activation-2.28e-02

Pos logit contributions
suits+0.012
utes+0.011
nes+0.011
oe+0.010
 wrist+0.010
Neg logit contributions
ocol-0.015
 wolves-0.014
 wolf-0.014
elling-0.014
arter-0.014
Token here
Token ID994
Feature activation-1.91e-02

Pos logit contributions
suits+0.032
utes+0.031
nes+0.031
J+0.030
 wrist+0.030
Neg logit contributions
ocol-0.053
 wolves-0.048
 wolf-0.048
 hunger-0.046
elling-0.046
Token top
Token ID1353
Feature activation+3.71e-02

Pos logit contributions
ocol+0.016
 wolves+0.014
elling+0.014
 wolf+0.014
 worries+0.014
Neg logit contributions
suits-0.014
utes-0.013
nes-0.013
J-0.013
oe-0.012
Token,
Token ID11
Feature activation-1.05e-02

Pos logit contributions
ocol+0.073
 wolves+0.064
elling+0.063
 wolf+0.063
arter+0.063
Neg logit contributions
suits-0.068
utes-0.065
oe-0.064
nes-0.063
J-0.061
Token Tom
Token ID4186
Feature activation+6.43e-03

Pos logit contributions
suits+0.016
utes+0.015
nes+0.015
J+0.014
oe+0.014
Neg logit contributions
ocol-0.024
 wolves-0.021
elling-0.021
 wolf-0.021
arter-0.021
Token stopped
Token ID5025
Feature activation+1.23e-02

Pos logit contributions
ocol+0.015
 wolves+0.013
arter+0.013
elling+0.013
 wolf+0.013
Neg logit contributions
suits-0.010
utes-0.009
nes-0.009
J-0.009
oe-0.009
Token to
Token ID284
Feature activation+1.10e-02

Pos logit contributions
ocol+0.030
elling+0.029
 wolves+0.027
arter+0.027
 worries+0.027
Neg logit contributions
utes-0.015
suits-0.015
 wrist-0.014
J-0.014
oe-0.014
Token look
Token ID804
Feature activation+7.64e-02

Pos logit contributions
ocol+0.027
elling+0.025
 wolves+0.025
arter+0.024
 wolf+0.024
Neg logit contributions
suits-0.014
 wrist-0.013
utes-0.013
nes-0.013
J-0.013
Token around
Token ID1088
Feature activation+5.30e-03

Pos logit contributions
ocol+0.171
opl+0.157
 wolves+0.157
arter+0.155
elling+0.152
Neg logit contributions
suits-0.119
utes-0.107
oe-0.102
J-0.101
nes-0.098
Token.
Token ID13
Feature activation+2.00e-02

Pos logit contributions
ocol+0.013
elling+0.012
 wolves+0.012
arter+0.012
 worries+0.012
Neg logit contributions
suits-0.007
utes-0.006
J-0.006
nes-0.006
oe-0.006
Token He
Token ID679
Feature activation+9.81e-04

Pos logit contributions
ocol+0.036
 wolves+0.031
elling+0.031
 wolf+0.031
arter+0.030
Neg logit contributions
suits-0.040
utes-0.038
nes-0.037
J-0.037
oe-0.036
Token could
Token ID714
Feature activation-2.94e-02

Pos logit contributions
ocol+0.002
arter+0.002
elling+0.002
 wolves+0.002
opl+0.002
Neg logit contributions
suits-0.001
utes-0.001
J-0.001
nes-0.001
 wrist-0.001
Token see
Token ID766
Feature activation+3.67e-02

Pos logit contributions
suits+0.053
utes+0.048
J+0.046
oe+0.044
nes+0.043
Neg logit contributions
ocol-0.057
 hunger-0.055
 wolf-0.055
 worries-0.055
 wolves-0.054
Token loved
Token ID6151
Feature activation-1.66e-02

Pos logit contributions
ocol+0.052
 wolves+0.046
elling+0.046
arter+0.045
 wolf+0.045
Neg logit contributions
suits-0.046
utes-0.043
nes-0.042
J-0.041
oe-0.041
Token to
Token ID284
Feature activation+2.17e-02

Pos logit contributions
utes+0.012
suits+0.011
 wrist+0.010
J+0.010
pack+0.009
Neg logit contributions
ocol-0.050
arter-0.046
elling-0.046
opl-0.044
 worries-0.044
Token play
Token ID711
Feature activation-3.46e-02

Pos logit contributions
ocol+0.053
elling+0.048
 wolves+0.048
 worries+0.047
 wolf+0.047
Neg logit contributions
suits-0.029
utes-0.028
 wrist-0.027
nes-0.026
J-0.025
Token outside
Token ID2354
Feature activation-5.33e-03

Pos logit contributions
suits+0.055
utes+0.050
nes+0.048
oe+0.048
J+0.046
Neg logit contributions
ocol-0.076
 wolf-0.072
 wolves-0.071
 ac-0.067
 emergency-0.067
Token and
Token ID290
Feature activation+3.01e-02

Pos logit contributions
suits+0.008
utes+0.007
nes+0.007
J+0.007
 wrist+0.006
Neg logit contributions
ocol-0.013
 wolves-0.012
 wolf-0.012
elling-0.011
 hunger-0.011
Token feel
Token ID1254
Feature activation+7.26e-02

Pos logit contributions
ocol+0.078
elling+0.071
arter+0.069
 wolves+0.067
opl+0.066
Neg logit contributions
suits-0.037
 wrist-0.036
utes-0.034
nes-0.033
But-0.033
Token the
Token ID262
Feature activation+2.94e-02

Pos logit contributions
ocol+0.187
elling+0.176
arter+0.176
opl+0.170
 worries+0.168
Neg logit contributions
suits-0.071
They-0.069
 fast-0.068
utes-0.067
J-0.067
Token warm
Token ID5814
Feature activation+3.29e-02

Pos logit contributions
ocol+0.060
arter+0.052
elling+0.051
 wolves+0.050
 hunger+0.049
Neg logit contributions
suits-0.053
utes-0.049
J-0.048
 wrist-0.048
nes-0.048
Token sun
Token ID4252
Feature activation-2.12e-03

Pos logit contributions
ocol+0.078
arter+0.067
 shouldn+0.066
 destruction+0.065
 obstacle+0.064
Neg logit contributions
suits-0.048
oe-0.043
 wrist-0.043
Mom-0.043
suit-0.041
Token on
Token ID319
Feature activation+2.67e-02

Pos logit contributions
suits+0.003
utes+0.003
nes+0.003
oe+0.003
J+0.003
Neg logit contributions
ocol-0.005
 wolves-0.005
elling-0.005
arter-0.005
 worries-0.005
Token her
Token ID607
Feature activation-1.03e-02

Pos logit contributions
ocol+0.054
 wolves+0.048
elling+0.047
 worries+0.047
 wolf+0.047
Neg logit contributions
suits-0.046
utes-0.045
J-0.043
nes-0.043
 wrist-0.042
Token store
Token ID3650
Feature activation-3.26e-03

Pos logit contributions
suits+0.010
utes+0.009
oe+0.008
nes+0.008
J+0.008
Neg logit contributions
ocol-0.017
 wolves-0.015
elling-0.015
 worries-0.015
arter-0.015
Token,
Token ID11
Feature activation-6.29e-04

Pos logit contributions
suits+0.005
utes+0.005
nes+0.005
J+0.005
 wrist+0.005
Neg logit contributions
ocol-0.007
 wolves-0.006
 wolf-0.006
elling-0.006
 worries-0.006
Token Sam
Token ID3409
Feature activation+6.70e-03

Pos logit contributions
suits+0.001
utes+0.001
nes+0.001
J+0.001
oe+0.001
Neg logit contributions
ocol-0.001
 wolves-0.001
elling-0.001
 wolf-0.001
arter-0.001
Token liked
Token ID8288
Feature activation-3.71e-02

Pos logit contributions
ocol+0.014
 wolves+0.013
elling+0.012
arter+0.012
 wolf+0.012
Neg logit contributions
suits-0.011
utes-0.011
nes-0.011
J-0.010
oe-0.010
Token to
Token ID284
Feature activation+5.76e-03

Pos logit contributions
suits+0.041
utes+0.036
J+0.035
 wrist+0.034
oe+0.034
Neg logit contributions
ocol-0.098
arter-0.092
elling-0.092
opl-0.088
 worries-0.088
Token look
Token ID804
Feature activation+7.07e-02

Pos logit contributions
ocol+0.014
 wolves+0.013
elling+0.012
arter+0.012
 wolf+0.012
Neg logit contributions
suits-0.008
nes-0.007
utes-0.007
 wrist-0.007
J-0.007
Token at
Token ID379
Feature activation+1.72e-02

Pos logit contributions
ocol+0.158
opl+0.153
arter+0.150
 wolves+0.146
elling+0.143
Neg logit contributions
suits-0.109
utes-0.094
oe-0.091
J-0.088
pack-0.085
Token all
Token ID477
Feature activation+3.69e-02

Pos logit contributions
ocol+0.043
arter+0.042
elling+0.040
opl+0.039
 earthquake+0.039
Neg logit contributions
suits-0.019
oe-0.017
 wrist-0.017
 first-0.016
pack-0.016
Token the
Token ID262
Feature activation+2.03e-02

Pos logit contributions
ocol+0.085
arter+0.079
 hunger+0.079
 worries+0.079
 wolf+0.077
Neg logit contributions
suits-0.051
utes-0.047
oe-0.047
J-0.047
They-0.045
Token different
Token ID1180
Feature activation-9.27e-03

Pos logit contributions
arter+0.041
ocol+0.041
opl+0.039
 hunger+0.038
elling+0.038
Neg logit contributions
suits-0.038
oe-0.030
J-0.030
utes-0.029
They-0.029
Token things
Token ID1243
Feature activation-3.80e-03

Pos logit contributions
suits+0.017
They+0.013
j+0.013
J+0.013
utes+0.013
Neg logit contributions
arter-0.020
ocol-0.019
opl-0.018
 ow-0.018
 hunger-0.018
Token
Token ID198
Feature activation+1.12e-03

Pos logit contributions
ocol+0.030
elling+0.028
arter+0.027
 hunger+0.026
 wolves+0.026
Neg logit contributions
nes-0.027
suits-0.026
 wrist-0.024
z-0.024
utes-0.024
Token
Token ID198
Feature activation+7.56e-03

Pos logit contributions
ocol+0.001
 shouldn+0.001
elling+0.001
 destruction+0.001
 wolves+0.001
Neg logit contributions
 wrist-0.003
suit-0.002
ault-0.002
 wrists-0.002
suits-0.002
Token"
Token ID1
Feature activation-5.49e-03

Pos logit contributions
ocol+0.017
elling+0.015
 shouldn+0.014
 emergency+0.014
 wolves+0.014
Neg logit contributions
 wrist-0.011
suits-0.009
 hips-0.009
suit-0.009
nes-0.009
TokenWow
Token ID22017
Feature activation-9.68e-03

Pos logit contributions
utes+0.009
suits+0.009
nes+0.008
 wrist+0.008
oe+0.008
Neg logit contributions
ocol-0.012
elling-0.010
 wolves-0.010
 wolf-0.010
 hunger-0.010
Token,
Token ID11
Feature activation-1.87e-02

Pos logit contributions
suits+0.019
utes+0.019
nes+0.018
J+0.018
 wrist+0.018
Neg logit contributions
ocol-0.017
 worries-0.015
 wolves-0.015
 wolf-0.015
elling-0.014
Token look
Token ID804
Feature activation+6.89e-02

Pos logit contributions
suits+0.029
utes+0.027
nes+0.026
J+0.026
oe+0.025
Neg logit contributions
ocol-0.042
 wolves-0.038
 wolf-0.038
elling-0.038
arter-0.037
Token at
Token ID379
Feature activation+1.30e-02

Pos logit contributions
ocol+0.127
 wolves+0.111
elling+0.109
 worries+0.109
 wolf+0.109
Neg logit contributions
suits-0.135
utes-0.130
nes-0.127
J-0.124
 wrist-0.122
Token that
Token ID326
Feature activation-4.61e-03

Pos logit contributions
ocol+0.030
 wolves+0.026
 hunger+0.026
 wolf+0.026
opl+0.026
Neg logit contributions
utes-0.019
suits-0.018
nes-0.017
oe-0.016
J-0.016
Token!"
Token ID2474
Feature activation-2.17e-02

Pos logit contributions
suits+0.008
utes+0.007
nes+0.007
J+0.007
oe+0.007
Neg logit contributions
ocol-0.010
 wolves-0.009
elling-0.009
 wolf-0.009
arter-0.009
Token Jill
Token ID22772
Feature activation-6.08e-03

Pos logit contributions
suits+0.041
utes+0.039
nes+0.038
 wrist+0.037
z+0.037
Neg logit contributions
ocol-0.041
 worries-0.037
elling-0.036
 hunger-0.036
 wolves-0.036
Token said
Token ID531
Feature activation+1.90e-04

Pos logit contributions
utes+0.012
suits+0.011
J+0.011
oe+0.011
nes+0.011
Neg logit contributions
ocol-0.010
 worries-0.010
 wolves-0.009
elling-0.009
 wolf-0.009
Token pasta
Token ID26296
Feature activation+1.77e-02

Pos logit contributions
suits+0.004
utes+0.004
nes+0.003
J+0.003
oe+0.003
Neg logit contributions
ocol-0.004
 wolves-0.004
elling-0.004
 wolf-0.004
arter-0.004
Token was
Token ID373
Feature activation+1.63e-02

Pos logit contributions
ocol+0.041
elling+0.036
 wolves+0.036
arter+0.035
opl+0.035
Neg logit contributions
suits-0.028
oe-0.025
J-0.025
nes-0.025
utes-0.025
Token ready
Token ID3492
Feature activation+1.55e-02

Pos logit contributions
ocol+0.035
 wolves+0.032
elling+0.031
 wolf+0.031
 worries+0.031
Neg logit contributions
suits-0.027
utes-0.026
nes-0.025
J-0.024
oe-0.024
Token,
Token ID11
Feature activation-5.00e-03

Pos logit contributions
ocol+0.036
 wolves+0.032
elling+0.032
 wolf+0.032
arter+0.032
Neg logit contributions
suits-0.023
utes-0.021
nes-0.021
J-0.020
oe-0.020
Token it
Token ID340
Feature activation+3.56e-02

Pos logit contributions
suits+0.007
utes+0.007
J+0.007
nes+0.006
oe+0.006
Neg logit contributions
ocol-0.012
elling-0.011
 wolves-0.011
arter-0.011
opl-0.010
Token smelled
Token ID37609
Feature activation+6.83e-02

Pos logit contributions
ocol+0.086
opl+0.076
arter+0.075
elling+0.075
 wolves+0.073
Neg logit contributions
suits-0.053
nes-0.050
oe-0.049
J-0.049
Anna-0.047
Token so
Token ID523
Feature activation+2.70e-02

Pos logit contributions
ocol+0.155
opl+0.141
arter+0.140
 wolves+0.138
elling+0.138
Neg logit contributions
suits-0.098
J-0.094
They-0.094
oe-0.090
But-0.087
Token sweet
Token ID6029
Feature activation+2.39e-02

Pos logit contributions
ocol+0.060
opl+0.057
 wolves+0.057
 wolf+0.055
elling+0.054
Neg logit contributions
suits-0.041
oe-0.039
J-0.038
utes-0.036
They-0.036
Token and
Token ID290
Feature activation-2.28e-03

Pos logit contributions
ocol+0.065
opl+0.058
arter+0.058
 wolves+0.057
 hunger+0.056
Neg logit contributions
suits-0.025
They-0.023
oe-0.022
J-0.022
ana-0.021
Token delicious
Token ID12625
Feature activation+1.89e-02

Pos logit contributions
suits+0.004
J+0.003
oe+0.003
 wrist+0.003
Mom+0.003
Neg logit contributions
ocol-0.005
opl-0.005
arter-0.004
elling-0.004
 wolves-0.004
Token that
Token ID326
Feature activation-7.82e-03

Pos logit contributions
ocol+0.045
 wolves+0.040
elling+0.040
arter+0.040
opl+0.039
Neg logit contributions
suits-0.027
utes-0.024
J-0.024
oe-0.024
nes-0.023
Token to
Token ID284
Feature activation-2.45e-04

Pos logit contributions
suits+0.016
nes+0.015
utes+0.015
oe+0.014
 wrist+0.014
Neg logit contributions
ocol-0.032
 wolves-0.028
elling-0.028
 wolf-0.028
arter-0.028
Token protect
Token ID1805
Feature activation+4.30e-03

Pos logit contributions
 wrist+0.000
pack+0.000
J+0.000
suits+0.000
They+0.000
Neg logit contributions
ocol-0.001
elling-0.001
 wolves-0.001
 worries-0.001
arter-0.001
Token them
Token ID606
Feature activation-2.00e-03

Pos logit contributions
ocol+0.009
elling+0.008
arter+0.008
opl+0.008
 worries+0.008
Neg logit contributions
suits-0.007
utes-0.007
nes-0.007
J-0.007
oe-0.006
Token.
Token ID13
Feature activation-1.03e-02

Pos logit contributions
suits+0.003
utes+0.002
nes+0.002
oe+0.002
 wrist+0.002
Neg logit contributions
ocol-0.005
elling-0.004
 wolves-0.004
 wolf-0.004
arter-0.004
Token<|endoftext|>
Token ID50256
Feature activation-3.36e-02

Pos logit contributions
suits+0.009
utes+0.008
nes+0.007
J+0.007
oe+0.007
Neg logit contributions
ocol-0.031
 wolves-0.028
elling-0.028
 worries-0.028
 wolf-0.028
TokenOnce
Token ID7454
Feature activation+1.58e-02

Pos logit contributions
utes+0.053
suits+0.051
nes+0.050
 wrist+0.050
pack+0.043
Neg logit contributions
ocol-0.067
 wolves-0.066
 wolf-0.066
opl-0.063
 destruction-0.062
Token upon
Token ID2402
Feature activation-2.78e-02

Pos logit contributions
ocol+0.033
 wolves+0.030
 wolf+0.029
arter+0.029
 worries+0.029
Neg logit contributions
suits-0.027
utes-0.025
nes-0.025
 wrist-0.024
J-0.024
Token a
Token ID257
Feature activation+4.90e-04

Pos logit contributions
suits+0.055
utes+0.054
oe+0.050
nes+0.049
suit+0.048
Neg logit contributions
 wolves-0.048
ocol-0.046
 wolf-0.046
 hunger-0.046
 ow-0.045
Token time
Token ID640
Feature activation-5.56e-03

Pos logit contributions
ocol+0.001
 hunger+0.001
 worries+0.001
 wolves+0.001
 wolf+0.001
Neg logit contributions
utes-0.001
nes-0.001
 wrist-0.001
suits-0.001
J-0.001
Token,
Token ID11
Feature activation+1.06e-02

Pos logit contributions
suits+0.011
utes+0.011
nes+0.010
 wrist+0.010
J+0.010
Neg logit contributions
ocol-0.010
 wolves-0.009
 worries-0.009
arter-0.009
 wolf-0.009
Token there
Token ID612
Feature activation+6.66e-03

Pos logit contributions
ocol+0.025
 wolves+0.023
 wolf+0.022
arter+0.022
 hunger+0.022
Neg logit contributions
suits-0.015
utes-0.015
nes-0.014
 wrist-0.014
J-0.013
Token to
Token ID284
Feature activation+1.24e-02

Pos logit contributions
suits+0.026
utes+0.025
nes+0.024
J+0.023
oe+0.023
Neg logit contributions
ocol-0.048
 wolves-0.044
elling-0.043
 wolf-0.043
arter-0.043
Token play
Token ID711
Feature activation-6.07e-02

Pos logit contributions
ocol+0.026
 wolves+0.023
 worries+0.023
arter+0.023
 hunger+0.023
Neg logit contributions
suits-0.021
utes-0.020
nes-0.019
J-0.019
oe-0.018
Token with
Token ID351
Feature activation-4.88e-03

Pos logit contributions
suits+0.069
utes+0.065
nes+0.059
J+0.058
oe+0.056
Neg logit contributions
ocol-0.161
 wolves-0.148
 wolf-0.145
 worries-0.144
elling-0.143
Token.
Token ID13
Feature activation-8.78e-03

Pos logit contributions
suits+0.007
utes+0.006
nes+0.006
J+0.006
oe+0.006
Neg logit contributions
ocol-0.012
elling-0.011
 wolves-0.011
arter-0.011
 wolf-0.011
Token When
Token ID1649
Feature activation-2.90e-02

Pos logit contributions
suits+0.016
utes+0.014
nes+0.014
oe+0.014
J+0.014
Neg logit contributions
ocol-0.017
opl-0.015
 wolves-0.015
 worries-0.015
elling-0.015
Token she
Token ID673
Feature activation+2.08e-02

Pos logit contributions
suits+0.052
utes+0.048
nes+0.046
 wrist+0.044
J+0.044
Neg logit contributions
ocol-0.058
 wolves-0.054
 worries-0.053
 wolf-0.053
 hunger-0.052
Token was
Token ID373
Feature activation+1.30e-02

Pos logit contributions
ocol+0.047
 wolves+0.043
elling+0.042
arter+0.042
 wolf+0.042
Neg logit contributions
suits-0.033
J-0.031
nes-0.031
oe-0.030
utes-0.030
Token done
Token ID1760
Feature activation+2.16e-02

Pos logit contributions
ocol+0.022
 worries+0.019
 wolves+0.019
 hunger+0.019
 wolf+0.018
Neg logit contributions
suits-0.028
utes-0.027
 wrist-0.025
nes-0.025
J-0.024
Token playing
Token ID2712
Feature activation-4.18e-02

Pos logit contributions
ocol+0.045
 wolves+0.039
 wolf+0.039
 worries+0.038
 hunger+0.038
Neg logit contributions
suits-0.039
utes-0.036
nes-0.035
 wrist-0.033
J-0.033
Token,
Token ID11
Feature activation-6.51e-03

Pos logit contributions
suits+0.057
utes+0.053
nes+0.052
J+0.051
oe+0.050
Neg logit contributions
ocol-0.101
elling-0.092
 wolves-0.092
arter-0.091
 wolf-0.090
Token she
Token ID673
Feature activation+1.49e-02

Pos logit contributions
suits+0.011
utes+0.010
nes+0.010
oe+0.009
J+0.009
Neg logit contributions
ocol-0.014
 wolves-0.013
 wolf-0.013
 worries-0.012
 hunger-0.012
Token gentle
Token ID10296
Feature activation-4.21e-03

Pos logit contributions
suits+0.037
utes+0.035
nes+0.034
J+0.033
oe+0.032
Neg logit contributions
ocol-0.053
 wolves-0.048
 wolf-0.047
elling-0.047
 worries-0.047
Token push
Token ID4574
Feature activation-2.15e-03

Pos logit contributions
suits+0.007
utes+0.007
nes+0.006
J+0.006
oe+0.006
Neg logit contributions
ocol-0.009
 wolves-0.008
 wolf-0.008
 worries-0.008
 hunger-0.008
Token.
Token ID13
Feature activation+1.30e-02

Pos logit contributions
suits+0.003
utes+0.003
nes+0.003
J+0.003
 wrist+0.003
Neg logit contributions
ocol-0.005
elling-0.004
 wolves-0.004
 worries-0.004
 wolf-0.004
Token The
Token ID383
Feature activation+7.48e-03

Pos logit contributions
ocol+0.026
 wolves+0.023
elling+0.023
 wolf+0.023
 worries+0.023
Neg logit contributions
suits-0.024
utes-0.022
nes-0.022
J-0.021
oe-0.021
Token girl
Token ID2576
Feature activation-1.70e-03

Pos logit contributions
ocol+0.016
arter+0.014
opl+0.013
 wolves+0.013
elling+0.013
Neg logit contributions
suits-0.013
utes-0.012
 wrist-0.012
oe-0.012
nes-0.012
Token smiled
Token ID13541
Feature activation+1.33e-02

Pos logit contributions
suits+0.003
utes+0.003
nes+0.003
J+0.003
oe+0.003
Neg logit contributions
ocol-0.004
 wolves-0.003
 wolf-0.003
elling-0.003
 worries-0.003
Token.
Token ID13
Feature activation+1.53e-02

Pos logit contributions
ocol+0.031
 wolves+0.029
 wolf+0.029
elling+0.028
 worries+0.028
Neg logit contributions
suits-0.019
utes-0.018
nes-0.017
J-0.017
oe-0.016
Token She
Token ID1375
Feature activation+9.60e-03

Pos logit contributions
ocol+0.029
 wolves+0.026
elling+0.025
 wolf+0.025
 worries+0.025
Neg logit contributions
suits-0.029
utes-0.028
nes-0.027
J-0.027
oe-0.026
Token was
Token ID373
Feature activation-4.96e-03

Pos logit contributions
ocol+0.022
 wolves+0.020
arter+0.020
elling+0.019
 hunger+0.019
Neg logit contributions
suits-0.015
utes-0.014
nes-0.014
J-0.013
 wrist-0.013
Token so
Token ID523
Feature activation+2.07e-02

Pos logit contributions
suits+0.008
utes+0.008
nes+0.007
J+0.007
oe+0.007
Neg logit contributions
ocol-0.011
 wolves-0.010
 wolf-0.010
 worries-0.010
 hunger-0.009
Token happy
Token ID3772
Feature activation-3.05e-02

Pos logit contributions
ocol+0.039
 worries+0.035
elling+0.035
arter+0.035
 wolf+0.035
Neg logit contributions
suits-0.039
utes-0.035
nes-0.035
 wrist-0.033
suit-0.032
Token known
Token ID1900
Feature activation+1.03e-02

Pos logit contributions
ocol+0.012
 wolves+0.011
elling+0.011
arter+0.011
 wolf+0.011
Neg logit contributions
suits-0.009
utes-0.009
nes-0.009
J-0.009
oe-0.009
Token as
Token ID355
Feature activation-3.01e-02

Pos logit contributions
ocol+0.021
 wolves+0.018
elling+0.018
 wolf+0.018
arter+0.018
Neg logit contributions
suits-0.018
utes-0.018
nes-0.017
J-0.017
oe-0.016
Token the
Token ID262
Feature activation+9.58e-03

Pos logit contributions
J+0.039
suits+0.038
utes+0.038
oe+0.036
nes+0.035
Neg logit contributions
ocol-0.071
elling-0.065
opl-0.064
arter-0.064
 wolves-0.063
Token biggest
Token ID4094
Feature activation+1.06e-02

Pos logit contributions
ocol+0.019
arter+0.016
elling+0.016
 wolves+0.016
 hunger+0.016
Neg logit contributions
suits-0.017
utes-0.017
J-0.016
 wrist-0.016
oe-0.016
Token and
Token ID290
Feature activation+8.47e-03

Pos logit contributions
ocol+0.023
elling+0.022
arter+0.019
 hunger+0.018
 poisonous+0.018
Neg logit contributions
 wrist-0.014
 with-0.014
 line-0.014
 fast-0.014
suits-0.014
Token brav
Token ID49025
Feature activation+3.05e-02

Pos logit contributions
ocol+0.019
arter+0.017
elling+0.016
 hunger+0.015
opl+0.015
Neg logit contributions
suits-0.013
 wrist-0.012
nes-0.012
J-0.012
oe-0.011
Tokenest
Token ID395
Feature activation+2.83e-03

Pos logit contributions
ocol+0.044
arter+0.039
 wolves+0.037
 hunger+0.035
 wolf+0.035
Neg logit contributions
suits-0.067
z-0.065
oe-0.064
utes-0.063
j-0.062
Token helper
Token ID31904
Feature activation-1.40e-02

Pos logit contributions
ocol+0.006
elling+0.005
arter+0.005
 hunger+0.005
 poisonous+0.005
Neg logit contributions
utes-0.004
oe-0.004
They-0.004
suits-0.004
But-0.004
Token in
Token ID287
Feature activation-2.13e-02

Pos logit contributions
suits+0.020
utes+0.019
nes+0.019
J+0.018
oe+0.018
Neg logit contributions
ocol-0.033
 wolves-0.029
elling-0.029
 wolf-0.029
 worries-0.029
Token the
Token ID262
Feature activation+3.76e-03

Pos logit contributions
utes+0.027
J+0.026
suits+0.025
oe+0.025
nes+0.024
Neg logit contributions
ocol-0.049
elling-0.048
arter-0.045
 worries-0.045
 wolf-0.045
Token jungle
Token ID20712
Feature activation-4.42e-02

Pos logit contributions
ocol+0.006
arter+0.006
elling+0.005
 wolves+0.005
 worries+0.005
Neg logit contributions
suits-0.008
oe-0.007
J-0.007
 wrist-0.007
Mom-0.007
Token "
Token ID366
Feature activation-2.63e-02

Pos logit contributions
suits+0.066
utes+0.063
nes+0.062
pack+0.056
oe+0.056
Neg logit contributions
 wolf-0.047
ocol-0.045
 wolves-0.043
 hunger-0.041
 accidents-0.041
TokenLet
Token ID5756
Feature activation-8.08e-03

Pos logit contributions
suits+0.047
utes+0.043
pack+0.037
nes+0.037
oe+0.037
Neg logit contributions
elling-0.049
 accidents-0.048
 wolf-0.048
 shouldn-0.047
 wolves-0.047
Token's
Token ID338
Feature activation-7.64e-03

Pos logit contributions
suits+0.012
utes+0.011
nes+0.011
J+0.011
oe+0.011
Neg logit contributions
ocol-0.019
 wolves-0.017
arter-0.017
elling-0.017
 wolf-0.016
Token go
Token ID467
Feature activation-2.20e-02

Pos logit contributions
suits+0.011
utes+0.011
nes+0.011
 wrist+0.010
J+0.010
Neg logit contributions
ocol-0.018
 wolves-0.016
elling-0.016
 wolf-0.015
arter-0.015
Token find
Token ID1064
Feature activation+3.54e-02

Pos logit contributions
utes+0.036
suits+0.036
J+0.034
nes+0.033
 wrist+0.033
Neg logit contributions
ocol-0.045
 wolves-0.041
 wolf-0.041
elling-0.040
arter-0.040
Token it
Token ID340
Feature activation+2.13e-02

Pos logit contributions
ocol+0.085
opl+0.081
arter+0.081
 wolves+0.076
 wolf+0.075
Neg logit contributions
suits-0.045
utes-0.041
J-0.041
Anna-0.040
nes-0.039
Token together
Token ID1978
Feature activation-2.67e-02

Pos logit contributions
ocol+0.048
 wolves+0.044
 wolf+0.044
elling+0.044
 worries+0.043
Neg logit contributions
suits-0.031
utes-0.030
nes-0.028
J-0.028
 wrist-0.028
Token.
Token ID13
Feature activation+6.78e-03

Pos logit contributions
suits+0.042
utes+0.042
J+0.040
nes+0.039
oe+0.037
Neg logit contributions
ocol-0.056
 wolves-0.052
 worries-0.051
 wolf-0.050
elling-0.050
Token We
Token ID775
Feature activation-1.23e-02

Pos logit contributions
ocol+0.013
elling+0.012
 wolves+0.012
 wolf+0.011
 worries+0.011
Neg logit contributions
suits-0.013
utes-0.012
J-0.012
nes-0.012
oe-0.011
Token can
Token ID460
Feature activation-4.34e-02

Pos logit contributions
suits+0.021
nes+0.020
utes+0.019
oe+0.019
 wrist+0.018
Neg logit contributions
ocol-0.026
elling-0.023
arter-0.023
opl-0.023
 wolves-0.023
Token choose
Token ID3853
Feature activation+6.07e-03

Pos logit contributions
suits+0.077
utes+0.073
J+0.071
nes+0.066
oe+0.065
Neg logit contributions
ocol-0.087
 wolf-0.080
 worries-0.080
 wolves-0.080
 hunger-0.079
Token's
Token ID338
Feature activation-4.05e-03

Pos logit contributions
suits+0.001
utes+0.001
nes+0.001
J+0.001
oe+0.001
Neg logit contributions
ocol-0.002
 wolves-0.001
elling-0.001
arter-0.001
 wolf-0.001
Token mom
Token ID1995
Feature activation+1.86e-02

Pos logit contributions
suits+0.006
nes+0.005
utes+0.005
J+0.005
oe+0.005
Neg logit contributions
ocol-0.009
 wolf-0.009
 wolves-0.009
 worries-0.009
 hunger-0.009
Token came
Token ID1625
Feature activation+1.29e-02

Pos logit contributions
ocol+0.041
 wolves+0.035
arter+0.035
elling+0.035
 hunger+0.035
Neg logit contributions
suits-0.031
utes-0.029
J-0.028
nes-0.028
oe-0.028
Token into
Token ID656
Feature activation+7.02e-03

Pos logit contributions
ocol+0.027
 wolves+0.025
 wolf+0.025
elling+0.024
 hunger+0.024
Neg logit contributions
suits-0.021
utes-0.020
nes-0.020
J-0.019
 wrist-0.019
Token his
Token ID465
Feature activation-2.93e-03

Pos logit contributions
ocol+0.014
 wolves+0.013
elling+0.013
 wolf+0.012
 worries+0.012
Neg logit contributions
suits-0.012
utes-0.012
nes-0.011
J-0.011
oe-0.011
Token room
Token ID2119
Feature activation-4.56e-03

Pos logit contributions
suits+0.003
utes+0.003
 wrist+0.003
oe+0.003
nes+0.003
Neg logit contributions
ocol-0.008
elling-0.007
 wolves-0.007
arter-0.007
 wolf-0.007
Token and
Token ID290
Feature activation+1.36e-03

Pos logit contributions
suits+0.007
utes+0.007
nes+0.007
J+0.007
Anna+0.006
Neg logit contributions
ocol-0.009
 wolves-0.009
 wolf-0.009
 hunger-0.009
arter-0.008
Token asked
Token ID1965
Feature activation+1.43e-02

Pos logit contributions
ocol+0.004
opl+0.003
arter+0.003
elling+0.003
 wolves+0.003
Neg logit contributions
suits-0.002
J-0.002
nes-0.002
oe-0.002
utes-0.002
Token him
Token ID683
Feature activation-5.57e-03

Pos logit contributions
ocol+0.031
elling+0.030
opl+0.030
arter+0.029
 worries+0.029
Neg logit contributions
J-0.021
suits-0.021
utes-0.019
They-0.019
oe-0.019
Token to
Token ID284
Feature activation+7.40e-03

Pos logit contributions
suits+0.011
utes+0.010
nes+0.010
J+0.010
oe+0.010
Neg logit contributions
ocol-0.011
elling-0.009
 wolves-0.009
opl-0.009
arter-0.009
Token pause
Token ID14985
Feature activation+4.16e-02

Pos logit contributions
ocol+0.015
 wolves+0.013
 hunger+0.013
 wolf+0.013
 worries+0.013
Neg logit contributions
suits-0.013
utes-0.012
nes-0.011
J-0.011
z-0.011
Token He
Token ID679
Feature activation-1.02e-02

Pos logit contributions
ocol+0.066
 worries+0.059
arter+0.058
 dangers+0.058
elling+0.057
Neg logit contributions
suits-0.037
oe-0.032
 wrist-0.030
J-0.030
Mom-0.030
Token was
Token ID373
Feature activation-7.35e-03

Pos logit contributions
suits+0.014
J+0.014
utes+0.013
oe+0.013
nes+0.013
Neg logit contributions
ocol-0.025
arter-0.022
elling-0.022
 wolves-0.022
opl-0.022
Token making
Token ID1642
Feature activation-3.74e-03

Pos logit contributions
suits+0.011
utes+0.011
nes+0.011
J+0.010
oe+0.010
Neg logit contributions
ocol-0.016
elling-0.015
 wolves-0.014
 worries-0.014
arter-0.014
Token a
Token ID257
Feature activation-1.40e-02

Pos logit contributions
suits+0.005
utes+0.005
J+0.005
They+0.005
nes+0.005
Neg logit contributions
ocol-0.008
arter-0.008
elling-0.008
opl-0.007
 wolves-0.007
Token hard
Token ID1327
Feature activation+1.43e-02

Pos logit contributions
suits+0.023
utes+0.022
J+0.021
oe+0.020
 wrist+0.020
Neg logit contributions
ocol-0.030
elling-0.027
 wolves-0.026
 worries-0.026
arter-0.025
Token toy
Token ID13373
Feature activation+2.50e-03

Pos logit contributions
ocol+0.028
 wolves+0.025
elling+0.025
 worries+0.024
 wolf+0.024
Neg logit contributions
suits-0.026
utes-0.025
nes-0.023
oe-0.023
J-0.023
Token animal
Token ID5044
Feature activation+1.74e-02

Pos logit contributions
ocol+0.006
elling+0.005
 worries+0.005
 hunger+0.005
 wolves+0.005
Neg logit contributions
suits-0.004
utes-0.004
nes-0.003
 wrist-0.003
oe-0.003
Token.
Token ID13
Feature activation+2.26e-02

Pos logit contributions
ocol+0.043
elling+0.039
 wolves+0.037
arter+0.037
 hunger+0.037
Neg logit contributions
suits-0.023
utes-0.022
 wrist-0.020
nes-0.020
oe-0.020
Token Tim
Token ID5045
Feature activation+1.12e-03

Pos logit contributions
ocol+0.051
arter+0.045
 wolves+0.044
 worries+0.044
elling+0.043
Neg logit contributions
suits-0.036
oe-0.033
utes-0.032
J-0.032
 wrist-0.031
Tokenmy
Token ID1820
Feature activation+2.67e-04

Pos logit contributions
ocol+0.002
 wolves+0.002
elling+0.002
arter+0.002
 wolf+0.002
Neg logit contributions
suits-0.002
utes-0.002
nes-0.002
J-0.002
oe-0.002
Token thought
Token ID1807
Feature activation+1.26e-02

Pos logit contributions
ocol+0.000
 wolves+0.000
 wolf+0.000
 worries+0.000
elling+0.000
Neg logit contributions
suits-0.001
utes-0.000
nes-0.000
J-0.000
oe-0.000
Token standing
Token ID5055
Feature activation-1.86e-02

Pos logit contributions
ocol+0.006
elling+0.006
 worries+0.006
 wolves+0.005
opl+0.005
Neg logit contributions
suits-0.004
oe-0.004
utes-0.003
 wrist-0.003
J-0.003
Token in
Token ID287
Feature activation-4.81e-03

Pos logit contributions
suits+0.031
utes+0.029
nes+0.028
 wrist+0.027
ana+0.025
Neg logit contributions
ocol-0.039
 wolf-0.035
 wolves-0.034
elling-0.034
 emergency-0.033
Token the
Token ID262
Feature activation+9.11e-03

Pos logit contributions
suits+0.008
utes+0.008
nes+0.008
J+0.007
oe+0.007
Neg logit contributions
ocol-0.010
 wolves-0.009
 wolf-0.009
elling-0.009
 worries-0.009
Token middle
Token ID3504
Feature activation-9.87e-03

Pos logit contributions
ocol+0.023
 worries+0.019
arter+0.019
 hunger+0.019
elling+0.018
Neg logit contributions
suits-0.014
Mom-0.012
Anna-0.012
 wrist-0.012
utes-0.012
Token of
Token ID286
Feature activation-9.75e-03

Pos logit contributions
suits+0.021
utes+0.020
nes+0.020
J+0.019
oe+0.019
Neg logit contributions
ocol-0.017
 wolves-0.014
 wolf-0.014
elling-0.014
 worries-0.014
Token the
Token ID262
Feature activation+2.88e-03

Pos logit contributions
suits+0.017
utes+0.016
nes+0.016
J+0.015
oe+0.015
Neg logit contributions
ocol-0.020
 wolves-0.018
elling-0.018
 wolf-0.018
 worries-0.018
Token street
Token ID4675
Feature activation-3.15e-02

Pos logit contributions
ocol+0.007
 worries+0.006
arter+0.006
elling+0.006
 hunger+0.006
Neg logit contributions
suits-0.004
utes-0.004
nes-0.004
oe-0.004
J-0.004
Token.
Token ID13
Feature activation+3.85e-03

Pos logit contributions
suits+0.042
utes+0.039
nes+0.037
J+0.037
oe+0.036
Neg logit contributions
ocol-0.078
 wolves-0.071
elling-0.070
 wolf-0.070
 worries-0.069
Token Puzz
Token ID27985
Feature activation+3.34e-02

Pos logit contributions
ocol+0.008
 wolf+0.007
 wolves+0.007
 hunger+0.007
 worries+0.007
Neg logit contributions
utes-0.006
suits-0.006
nes-0.006
J-0.005
z-0.005
Tokenled
Token ID992
Feature activation+1.22e-02

Pos logit contributions
ocol+0.063
 wolves+0.056
 hunger+0.055
 wolf+0.054
arter+0.054
Neg logit contributions
suits-0.066
utes-0.062
oe-0.060
nes-0.060
J-0.058
Token,
Token ID11
Feature activation-1.40e-02

Pos logit contributions
ocol+0.027
 wolves+0.025
elling+0.024
 wolf+0.024
arter+0.024
Neg logit contributions
suits-0.019
utes-0.018
nes-0.017
J-0.017
oe-0.017
Token very
Token ID845
Feature activation+2.51e-02

Pos logit contributions
suits+0.015
nes+0.014
utes+0.014
J+0.014
oe+0.014
Neg logit contributions
ocol-0.026
elling-0.023
 wolves-0.023
arter-0.023
 wolf-0.022
Token happy
Token ID3772
Feature activation-3.90e-02

Pos logit contributions
ocol+0.055
 wolves+0.049
elling+0.049
 wolf+0.048
arter+0.048
Neg logit contributions
suits-0.041
utes-0.039
nes-0.038
J-0.037
oe-0.036
Token,
Token ID11
Feature activation-3.35e-02

Pos logit contributions
suits+0.056
utes+0.053
nes+0.051
J+0.049
oe+0.048
Neg logit contributions
ocol-0.092
 wolves-0.084
 wolf-0.083
elling-0.083
 worries-0.082
Token but
Token ID475
Feature activation+2.90e-02

Pos logit contributions
suits+0.042
utes+0.039
nes+0.038
J+0.037
 wrist+0.036
Neg logit contributions
ocol-0.085
 wolves-0.077
elling-0.077
arter-0.076
 wolf-0.075
Token he
Token ID339
Feature activation+1.44e-03

Pos logit contributions
ocol+0.073
elling+0.065
arter+0.064
 wolves+0.064
opl+0.061
Neg logit contributions
J-0.037
But-0.036
oe-0.036
nes-0.035
suits-0.035
Token was
Token ID373
Feature activation-8.49e-03

Pos logit contributions
ocol+0.004
arter+0.003
elling+0.003
 wolves+0.003
opl+0.003
Neg logit contributions
suits-0.002
nes-0.002
J-0.002
oe-0.002
Anna-0.002
Token also
Token ID635
Feature activation+5.29e-03

Pos logit contributions
suits+0.011
nes+0.010
utes+0.010
J+0.010
oe+0.010
Neg logit contributions
ocol-0.021
elling-0.019
 wolves-0.019
arter-0.019
 wolf-0.019
Token sure
Token ID1654
Feature activation-2.81e-03

Pos logit contributions
ocol+0.013
elling+0.011
 wolves+0.011
arter+0.011
 wolf+0.011
Neg logit contributions
suits-0.008
utes-0.007
nes-0.007
J-0.007
oe-0.007
Token to
Token ID284
Feature activation-2.34e-03

Pos logit contributions
suits+0.004
nes+0.004
J+0.004
utes+0.004
They+0.004
Neg logit contributions
ocol-0.007
arter-0.006
elling-0.006
opl-0.006
 hunger-0.006
Token always
Token ID1464
Feature activation-1.40e-02

Pos logit contributions
suits+0.003
nes+0.003
utes+0.003
J+0.003
 wrist+0.003
Neg logit contributions
ocol-0.005
 wolves-0.005
elling-0.005
 wolf-0.005
arter-0.005
Token be
Token ID307
Feature activation+2.52e-03

Pos logit contributions
suits+0.020
 wrist+0.019
nes+0.019
utes+0.019
J+0.018
Neg logit contributions
ocol-0.033
elling-0.030
 wolves-0.029
 wolf-0.029
arter-0.029
TokenWhen
Token ID2215
Feature activation-2.75e-02

Pos logit contributions
ocol+0.020
opl+0.019
 wolves+0.019
 wolf+0.019
 dangers+0.018
Neg logit contributions
suits-0.013
suit-0.011
 wrist-0.011
utes-0.011
 hips-0.010
Token they
Token ID484
Feature activation-1.49e-03

Pos logit contributions
suits+0.036
utes+0.035
J+0.035
oe+0.034
nes+0.034
Neg logit contributions
ocol-0.067
elling-0.062
arter-0.059
 wolves-0.058
opl-0.058
Token got
Token ID1392
Feature activation+1.78e-02

Pos logit contributions
suits+0.002
nes+0.002
J+0.002
utes+0.002
oe+0.002
Neg logit contributions
ocol-0.003
 wolves-0.003
elling-0.003
 wolf-0.003
arter-0.003
Token home
Token ID1363
Feature activation+4.55e-02

Pos logit contributions
ocol+0.033
 wolves+0.030
 wolf+0.030
 hunger+0.030
 worries+0.030
Neg logit contributions
suits-0.034
utes-0.033
nes-0.032
J-0.031
 wrist-0.031
Token,
Token ID11
Feature activation-7.53e-03

Pos logit contributions
ocol+0.112
 wolves+0.101
elling+0.101
opl+0.099
 wolf+0.098
Neg logit contributions
suits-0.061
utes-0.057
oe-0.057
nes-0.056
J-0.054
Token Jake
Token ID14757
Feature activation-2.21e-02

Pos logit contributions
suits+0.012
utes+0.011
nes+0.011
J+0.011
oe+0.010
Neg logit contributions
ocol-0.017
elling-0.015
arter-0.015
 wolves-0.015
opl-0.015
Token gave
Token ID2921
Feature activation+1.81e-02

Pos logit contributions
suits+0.036
utes+0.034
nes+0.033
J+0.032
oe+0.031
Neg logit contributions
ocol-0.048
 wolves-0.044
elling-0.043
 wolf-0.043
 worries-0.042
Token the
Token ID262
Feature activation+4.79e-03

Pos logit contributions
ocol+0.036
 wolves+0.031
elling+0.031
 wolf+0.031
 worries+0.031
Neg logit contributions
suits-0.033
utes-0.032
nes-0.031
J-0.030
oe-0.030
Token puppet
Token ID30095
Feature activation+4.34e-03

Pos logit contributions
ocol+0.010
elling+0.009
 wolves+0.009
arter+0.009
opl+0.008
Neg logit contributions
suits-0.008
utes-0.008
 wrist-0.008
nes-0.008
J-0.007
Token a
Token ID257
Feature activation-2.79e-02

Pos logit contributions
ocol+0.010
 wolves+0.009
elling+0.009
 wolf+0.009
arter+0.008
Neg logit contributions
suits-0.007
utes-0.006
nes-0.006
J-0.006
oe-0.006
Token big
Token ID1263
Feature activation-9.50e-03

Pos logit contributions
suits+0.045
utes+0.043
nes+0.042
J+0.040
oe+0.039
Neg logit contributions
ocol-0.060
 wolves-0.054
 wolf-0.054
 worries-0.054
elling-0.054
Token animals
Token ID4695
Feature activation-2.32e-02

Pos logit contributions
suits+0.028
utes+0.026
nes+0.025
J+0.025
 wrist+0.025
Neg logit contributions
ocol-0.031
elling-0.027
arter-0.027
 wolves-0.027
 hunger-0.026
Token again
Token ID757
Feature activation+2.79e-02

Pos logit contributions
suits+0.031
utes+0.029
nes+0.029
oe+0.027
J+0.027
Neg logit contributions
ocol-0.057
 wolves-0.051
elling-0.051
 wolf-0.051
arter-0.051
Token.
Token ID13
Feature activation+1.32e-03

Pos logit contributions
ocol+0.068
 wolves+0.062
 wolf+0.061
elling+0.061
arter+0.061
Neg logit contributions
suits-0.038
utes-0.036
nes-0.034
J-0.033
oe-0.033
Token Maybe
Token ID6674
Feature activation+8.01e-03

Pos logit contributions
ocol+0.003
 wolves+0.002
elling+0.002
 wolf+0.002
arter+0.002
Neg logit contributions
suits-0.002
utes-0.002
nes-0.002
J-0.002
oe-0.002
Token we
Token ID356
Feature activation-1.73e-02

Pos logit contributions
ocol+0.021
 wolves+0.019
elling+0.019
arter+0.019
 wolf+0.019
Neg logit contributions
suits-0.010
utes-0.009
nes-0.009
J-0.008
oe-0.008
Token will
Token ID481
Feature activation-4.30e-02

Pos logit contributions
suits+0.028
nes+0.025
utes+0.024
oe+0.024
They+0.023
Neg logit contributions
ocol-0.040
arter-0.036
elling-0.035
opl-0.035
 wolves-0.035
Token see
Token ID766
Feature activation+1.51e-02

Pos logit contributions
suits+0.073
utes+0.069
J+0.066
nes+0.065
oe+0.064
Neg logit contributions
ocol-0.089
 wolves-0.081
 wolf-0.080
 hunger-0.080
 worries-0.080
Token some
Token ID617
Feature activation+3.57e-04

Pos logit contributions
ocol+0.036
arter+0.033
opl+0.031
elling+0.031
 hunger+0.031
Neg logit contributions
suits-0.020
utes-0.020
nes-0.018
They-0.018
J-0.018
Token new
Token ID649
Feature activation-1.10e-02

Pos logit contributions
ocol+0.001
arter+0.001
elling+0.001
opl+0.001
 hunger+0.001
Neg logit contributions
suits-0.001
utes-0.001
They-0.001
Anna-0.001
 wrist-0.001
Token animals
Token ID4695
Feature activation-2.58e-02

Pos logit contributions
suits+0.021
utes+0.019
 wrist+0.018
nes+0.018
oe+0.017
Neg logit contributions
ocol-0.022
arter-0.020
elling-0.020
 hunger-0.019
opl-0.019
Token or
Token ID393
Feature activation-1.89e-02

Pos logit contributions
suits+0.035
utes+0.032
nes+0.032
oe+0.030
 wrist+0.029
Neg logit contributions
ocol-0.064
 wolves-0.057
elling-0.057
arter-0.057
 wolf-0.057
Token with
Token ID351
Feature activation-5.32e-03

Pos logit contributions
suits+0.071
utes+0.067
nes+0.065
J+0.064
oe+0.063
Neg logit contributions
ocol-0.094
 wolves-0.085
 wolf-0.083
elling-0.083
 worries-0.083
Token the
Token ID262
Feature activation+1.40e-02

Pos logit contributions
suits+0.006
utes+0.006
J+0.006
 wrist+0.006
They+0.006
Neg logit contributions
ocol-0.013
elling-0.012
opl-0.012
arter-0.012
 wolf-0.012
Token match
Token ID2872
Feature activation-6.70e-03

Pos logit contributions
ocol+0.032
elling+0.030
 wolves+0.029
 hunger+0.029
arter+0.029
Neg logit contributions
suits-0.019
oe-0.019
utes-0.019
nes-0.017
pack-0.017
Token.
Token ID13
Feature activation+9.04e-03

Pos logit contributions
suits+0.009
utes+0.008
nes+0.008
oe+0.007
 wrist+0.007
Neg logit contributions
ocol-0.017
 wolves-0.015
elling-0.015
 wolf-0.015
 hunger-0.015
Token They
Token ID1119
Feature activation-2.15e-02

Pos logit contributions
ocol+0.021
 wolves+0.019
elling+0.019
 wolf+0.019
 worries+0.019
Neg logit contributions
suits-0.013
utes-0.012
nes-0.012
J-0.012
oe-0.011
Token played
Token ID2826
Feature activation-5.24e-02

Pos logit contributions
suits+0.036
utes+0.035
nes+0.034
J+0.033
oe+0.032
Neg logit contributions
ocol-0.045
 wolves-0.041
 wolf-0.040
elling-0.040
 worries-0.040
Token a
Token ID257
Feature activation-1.74e-02

Pos logit contributions
suits+0.067
utes+0.062
nes+0.061
oe+0.060
j+0.060
Neg logit contributions
ocol-0.120
 wolves-0.120
 wolf-0.117
 worries-0.115
 hunger-0.114
Token game
Token ID983
Feature activation-4.18e-02

Pos logit contributions
suits+0.035
j+0.032
utes+0.032
ana+0.032
J+0.032
Neg logit contributions
 poisonous-0.024
 wolf-0.023
 obstacle-0.023
 worries-0.023
 wolves-0.023
Token of
Token ID286
Feature activation-4.09e-03

Pos logit contributions
suits+0.079
utes+0.075
 wrist+0.072
nes+0.072
J+0.071
Neg logit contributions
ocol-0.071
 wolves-0.070
 worries-0.067
opl-0.066
 wolf-0.065
Token tag
Token ID7621
Feature activation-6.85e-03

Pos logit contributions
suits+0.005
utes+0.005
nes+0.005
J+0.005
oe+0.005
Neg logit contributions
ocol-0.010
 wolves-0.009
 wolf-0.009
 worries-0.009
elling-0.009
Token,
Token ID11
Feature activation-3.70e-02

Pos logit contributions
suits+0.010
utes+0.009
nes+0.009
J+0.009
oe+0.008
Neg logit contributions
ocol-0.016
 wolves-0.015
 wolf-0.014
 worries-0.014
 hunger-0.014
Token was
Token ID373
Feature activation-3.39e-03

Pos logit contributions
ocol+0.036
arter+0.030
elling+0.030
 wolves+0.030
 hunger+0.030
Neg logit contributions
suits-0.020
nes-0.018
J-0.017
Anna-0.017
oe-0.017
Token wrong
Token ID2642
Feature activation-2.41e-02

Pos logit contributions
J+0.005
nes+0.004
utes+0.004
suits+0.004
oe+0.004
Neg logit contributions
ocol-0.008
elling-0.007
opl-0.007
 wolves-0.007
 worries-0.007
Token.
Token ID13
Feature activation-4.01e-03

Pos logit contributions
suits+0.034
utes+0.032
nes+0.031
J+0.029
oe+0.029
Neg logit contributions
ocol-0.057
 wolves-0.052
 worries-0.051
 wolf-0.051
elling-0.051
Token She
Token ID1375
Feature activation+3.62e-04

Pos logit contributions
suits+0.008
utes+0.007
nes+0.007
oe+0.007
J+0.007
Neg logit contributions
ocol-0.008
 wolves-0.007
elling-0.007
 wolf-0.007
arter-0.006
Token said
Token ID531
Feature activation+6.03e-03

Pos logit contributions
ocol+0.001
arter+0.001
opl+0.001
elling+0.001
 wolves+0.001
Neg logit contributions
suits-0.001
J-0.000
nes-0.000
Anna-0.000
They-0.000
Token,
Token ID11
Feature activation-4.32e-02

Pos logit contributions
ocol+0.014
 wolves+0.013
elling+0.013
arter+0.012
 wolf+0.012
Neg logit contributions
suits-0.009
utes-0.008
nes-0.008
J-0.008
oe-0.008
Token "
Token ID366
Feature activation-3.34e-02

Pos logit contributions
suits+0.064
utes+0.060
nes+0.058
J+0.057
oe+0.055
Neg logit contributions
ocol-0.100
 wolves-0.091
elling-0.090
 wolf-0.089
 worries-0.089
TokenI
Token ID40
Feature activation+2.26e-02

Pos logit contributions
suits+0.065
utes+0.058
nes+0.053
uits+0.052
suit+0.051
Neg logit contributions
ocol-0.061
 accidents-0.060
 wolf-0.060
elling-0.058
 wolves-0.058
Token am
Token ID716
Feature activation-2.66e-02

Pos logit contributions
ocol+0.052
elling+0.047
 wolves+0.046
 wolf+0.044
opl+0.044
Neg logit contributions
suits-0.036
J-0.034
nes-0.033
oe-0.033
utes-0.032
Token sorry
Token ID7926
Feature activation-1.04e-02

Pos logit contributions
suits+0.054
utes+0.053
nes+0.052
J+0.051
oe+0.051
Neg logit contributions
ocol-0.046
 wolves-0.040
elling-0.040
 wolf-0.039
 worries-0.039
Token,
Token ID11
Feature activation-4.59e-02

Pos logit contributions
suits+0.014
utes+0.014
J+0.013
nes+0.013
oe+0.013
Neg logit contributions
ocol-0.025
 wolves-0.023
elling-0.023
 hunger-0.023
 wolf-0.022
Token the
Token ID262
Feature activation-1.57e-02

Pos logit contributions
suits+0.037
utes+0.035
nes+0.035
J+0.033
oe+0.033
Neg logit contributions
ocol-0.039
 wolves-0.035
 wolf-0.034
 hunger-0.034
elling-0.033
Token sky
Token ID6766
Feature activation-2.40e-02

Pos logit contributions
suits+0.029
utes+0.027
nes+0.026
 wrist+0.026
J+0.026
Neg logit contributions
ocol-0.031
 wolves-0.027
arter-0.027
elling-0.027
 hunger-0.026
Token.
Token ID13
Feature activation-2.00e-04

Pos logit contributions
suits+0.034
utes+0.032
nes+0.030
J+0.030
oe+0.029
Neg logit contributions
ocol-0.057
 wolves-0.052
elling-0.051
 wolf-0.051
 worries-0.051
Token It
Token ID632
Feature activation+9.98e-03

Pos logit contributions
suits+0.000
nes+0.000
utes+0.000
pack+0.000
oe+0.000
Neg logit contributions
ocol-0.000
 wolves-0.000
elling-0.000
 wolf-0.000
arter-0.000
Token was
Token ID373
Feature activation-1.86e-02

Pos logit contributions
ocol+0.023
 wolves+0.020
elling+0.020
arter+0.020
 wolf+0.020
Neg logit contributions
suits-0.016
utes-0.014
nes-0.014
J-0.014
oe-0.014
Token like
Token ID588
Feature activation-4.41e-02

Pos logit contributions
suits+0.028
utes+0.027
nes+0.026
J+0.026
oe+0.025
Neg logit contributions
ocol-0.042
 wolves-0.038
elling-0.038
 wolf-0.037
arter-0.037
Token a
Token ID257
Feature activation-3.76e-02

Pos logit contributions
suits+0.057
utes+0.057
J+0.053
nes+0.052
 wrist+0.051
Neg logit contributions
ocol-0.108
elling-0.097
arter-0.096
 wolves-0.094
opl-0.094
Token magic
Token ID5536
Feature activation-1.93e-03

Pos logit contributions
suits+0.072
utes+0.069
nes+0.067
J+0.066
oe+0.064
Neg logit contributions
ocol-0.070
 wolves-0.063
 wolf-0.062
 worries-0.061
 hunger-0.061
Token show
Token ID905
Feature activation-1.98e-02

Pos logit contributions
suits+0.003
utes+0.003
 wrist+0.003
oe+0.003
nes+0.003
Neg logit contributions
ocol-0.004
elling-0.004
 wolves-0.004
arter-0.004
 hunger-0.004
Token.
Token ID13
Feature activation-3.21e-03

Pos logit contributions
suits+0.029
utes+0.027
nes+0.026
J+0.026
oe+0.025
Neg logit contributions
ocol-0.046
 wolves-0.042
elling-0.041
 wolf-0.041
 worries-0.041
Token Tim
Token ID5045
Feature activation-1.08e-02

Pos logit contributions
suits+0.005
nes+0.005
utes+0.005
oe+0.004
z+0.004
Neg logit contributions
ocol-0.007
 wolves-0.006
elling-0.006
 wolf-0.006
arter-0.006
Token eat
Token ID4483
Feature activation+7.47e-03

Pos logit contributions
suits+0.011
 wrist+0.011
They+0.011
utes+0.011
J+0.011
Neg logit contributions
ocol-0.019
elling-0.018
 wolves-0.018
 worries-0.017
 wolf-0.017
Token it
Token ID340
Feature activation+1.77e-02

Pos logit contributions
ocol+0.018
arter+0.016
elling+0.016
 wolves+0.016
 worries+0.016
Neg logit contributions
suits-0.010
utes-0.009
They-0.009
J-0.008
nes-0.008
Token,
Token ID11
Feature activation-3.12e-02

Pos logit contributions
ocol+0.049
arter+0.043
 wolves+0.043
elling+0.043
 wolf+0.043
Neg logit contributions
suits-0.020
utes-0.017
nes-0.016
oe-0.016
They-0.016
Token but
Token ID475
Feature activation+1.98e-02

Pos logit contributions
suits+0.040
utes+0.038
nes+0.036
 wrist+0.036
They+0.035
Neg logit contributions
ocol-0.077
elling-0.070
opl-0.069
 wolves-0.069
arter-0.068
Token I
Token ID314
Feature activation+1.44e-02

Pos logit contributions
ocol+0.054
opl+0.054
elling+0.051
 wolves+0.050
arter+0.049
Neg logit contributions
They-0.016
 first-0.016
utes-0.016
 wrist-0.015
suits-0.015
Token agree
Token ID4236
Feature activation-4.49e-02

Pos logit contributions
ocol+0.035
 wolves+0.033
elling+0.032
 wolf+0.032
 earthquake+0.031
Neg logit contributions
suits-0.020
nes-0.018
oe-0.018
J-0.017
Anna-0.017
Token,
Token ID11
Feature activation-4.24e-02

Pos logit contributions
suits+0.058
utes+0.056
nes+0.053
J+0.052
oe+0.050
Neg logit contributions
ocol-0.111
 wolves-0.101
elling-0.100
 wolf-0.100
 worries-0.099
Token it
Token ID340
Feature activation+2.24e-02

Pos logit contributions
suits+0.061
utes+0.058
nes+0.057
 wrist+0.056
J+0.056
Neg logit contributions
ocol-0.097
elling-0.087
 wolves-0.086
arter-0.086
opl-0.085
Token is
Token ID318
Feature activation-1.60e-02

Pos logit contributions
ocol+0.055
arter+0.046
 hunger+0.045
 wolves+0.045
 wolf+0.045
Neg logit contributions
suits-0.033
Anna-0.030
nes-0.029
oe-0.029
a-0.029
Token sm
Token ID895
Feature activation-5.14e-03

Pos logit contributions
suits+0.017
They+0.016
utes+0.016
ends+0.015
oe+0.015
Neg logit contributions
ocol-0.042
elling-0.038
opl-0.038
 wolves-0.037
arter-0.036
Tokenelly
Token ID6148
Feature activation+7.26e-03

Pos logit contributions
suits+0.012
utes+0.012
J+0.011
nes+0.011
oe+0.011
Neg logit contributions
ocol-0.007
 wolves-0.006
 wolf-0.006
 worries-0.006
 hunger-0.006