TOP ACTIVATIONS
MAX = 0.104

is shrimp . It is a small fish that lives in
This is shrimp . It is a small fish that lives
juice . Sam wanted a pizza and a milk . Mom
. He thinks it is a cool gun .
lunch . One had a tuna sandwich and the other an
They had a lot of choices . They were not bored
wanted to look for a sport he could do . Then
. This shrimp is not a food . It is a
These are peanuts . They are interesting because they grow underground
regular pizza . A regular pizza has cheese and sauce and
to play with toys . Ben has a new toy .
" Because tanks are loud and scary ," Mia
and look for bugs . Ben finds a big worm and
. He found some plastic containers . He thought they looked
" A mineral is a kind of rock that comes
ily offered Tim my some food , but he said no
, " It 's a tuna , and we can catch
family . He had a tuna ! He wanted to show
basket , Jack found delicious sandwiches , apples , and oranges
Tim wanted to find some food , so he hopped around

MIN ACTIVATIONS
MIN = -0.135

the sky fair ies spinning their magical club ."
ies spinning their magical club ." Sarah
in a world of make - bel ieve .
. It will make your ear better . Do you want
horn and made v room - v room noises .
The bright colors made Ab ig ail so happy .
sun would make the sc ooter too hot to touch .
light club spinning in the sky !"
his red cape and make - bel ieve he had super
day driving and making the world pretty . And from that
the little girl and her arm chair were the best of
shining brightly , making the world look so beautiful . She
from home and eventually stumbled upon a weird stage . On
when he was playing make - bel ieve at home ,
a light club spinning in the sky
blanket before it was torn S ue decided
way . This made the dress look different , with a
louder . Suddenly , the ringing stopped and a funny voice
song and make the toy chain dance in the air .
. She will make your arm better . Come on ,

INTERVAL 0.044 to 0.104
CONTAINS 1.922%

him . He is not a bad dog . He is
and un pack his favourite food . He was delighted to
and he asked his brother , " What is that ?"
so he asked his friend , Sam , for help .
up . They decide to share the car . They take

INTERVAL -0.015 to 0.044
CONTAINS 73.703%

" Can we pet him ?" Ben asked the man .
. The snake calmed down and sn ugg led further into
The rock flies up and hits Tim in the head .
Then , they spotted a giant pile of leaves .
of all , she was sad because she missed her home

INTERVAL -0.075 to -0.015
CONTAINS 24.232%

too . The little girl couldn 't believe it - the
books . Lily 's eyes got big when she saw all
. " What 's in the box ?" he asked .
on her face and legs . She laughed and sang .
hands and scattered it everywhere . Suddenly , his

INTERVAL -0.135 to -0.075
CONTAINS 0.143%

her magical whip around and made funny shapes in the air
store to buy some soap . She bought the soap and
But Lily didn 't listen . She wanted to talk to
laughed and p atted the pony 's back . They became
, after sailing far away and getting lost , it started
Token is
Token ID318
Feature activation+7.89e-02

Pos logit contributions
 Wally+0.093
 fisher+0.091
 Sebastian+0.091
 Finn+0.088
inn+0.087
Neg logit contributions
est-0.069
ness-0.067
bows-0.066
 sounds-0.062
als-0.061
Token shrimp
Token ID30060
Feature activation+4.87e-02

Pos logit contributions
 fisher+0.128
 fisherman+0.120
Dave+0.120
 Sebastian+0.117
olph+0.112
Neg logit contributions
 ceiling-0.088
clock-0.088
 walls-0.085
appointed-0.084
 Lightning-0.084
Token.
Token ID13
Feature activation+5.76e-02

Pos logit contributions
 Sebastian+0.077
 Wally+0.074
 fisher+0.073
 Finn+0.068
 fossil+0.067
Neg logit contributions
chair-0.076
appointed-0.066
bows-0.064
 halls-0.063
 ceiling-0.062
Token It
Token ID632
Feature activation+5.29e-02

Pos logit contributions
 Wally+0.098
 fisher+0.096
 tuna+0.094
 shrimp+0.093
 fossil+0.093
Neg logit contributions
chair-0.069
appointed-0.069
irds-0.066
ring-0.066
clock-0.064
Token is
Token ID318
Feature activation+9.64e-02

Pos logit contributions
 Sebastian+0.109
 fisher+0.106
 Wally+0.106
 fisherman+0.101
Dave+0.099
Neg logit contributions
als-0.045
bows-0.044
chair-0.044
appointed-0.043
 sounds-0.042
Token a
Token ID257
Feature activation+9.69e-02

Pos logit contributions
 fisher+0.162
Dave+0.154
 Sebastian+0.154
 fisherman+0.152
 Wally+0.149
Neg logit contributions
 walls-0.099
clock-0.098
 sounds-0.097
 ceiling-0.096
 same-0.094
Token small
Token ID1402
Feature activation+8.89e-02

Pos logit contributions
 Sebastian+0.161
 Wally+0.157
 Fin+0.149
 Finn+0.147
 Otto+0.145
Neg logit contributions
 roof-0.116
 ceiling-0.114
 wall-0.109
 whole-0.107
 song-0.106
Token fish
Token ID5916
Feature activation+6.40e-02

Pos logit contributions
 Wally+0.105
Stan+0.098
 Sebastian+0.098
 Fin+0.093
 Finn+0.093
Neg logit contributions
ness-0.164
est-0.155
 roof-0.147
 ceiling-0.147
 wall-0.139
Token that
Token ID326
Feature activation+5.14e-02

Pos logit contributions
 Sebastian+0.107
 Wally+0.106
 Otto+0.097
 fisher+0.095
 Finn+0.095
Neg logit contributions
chair-0.094
ring-0.083
age-0.080
bows-0.077
 roof-0.075
Token lives
Token ID3160
Feature activation+2.99e-02

Pos logit contributions
 Sebastian+0.085
 Wally+0.081
 fisher+0.079
 fossil+0.075
 tuna+0.075
Neg logit contributions
chair-0.067
appointed-0.064
 halls-0.063
als-0.062
 ceiling-0.061
Token in
Token ID287
Feature activation+4.36e-02

Pos logit contributions
 Wally+0.045
 Sebastian+0.045
 fisher+0.042
 tuna+0.042
 Otto+0.040
Neg logit contributions
chair-0.049
appointed-0.049
 halls-0.047
irds-0.045
clock-0.045
TokenThis
Token ID1212
Feature activation+5.69e-02

Pos logit contributions
 Wally+0.050
 tuna+0.047
 Sebastian+0.046
 fisher+0.044
 Finn+0.044
Neg logit contributions
appointed-0.045
chair-0.043
irds-0.040
ring-0.040
clock-0.040
Token is
Token ID318
Feature activation+7.89e-02

Pos logit contributions
 Wally+0.093
 fisher+0.091
 Sebastian+0.091
 Finn+0.088
inn+0.087
Neg logit contributions
est-0.069
ness-0.067
bows-0.066
 sounds-0.062
als-0.061
Token shrimp
Token ID30060
Feature activation+4.87e-02

Pos logit contributions
 fisher+0.128
 fisherman+0.120
Dave+0.120
 Sebastian+0.117
olph+0.112
Neg logit contributions
 ceiling-0.088
clock-0.088
 walls-0.085
appointed-0.084
 Lightning-0.084
Token.
Token ID13
Feature activation+5.76e-02

Pos logit contributions
 Sebastian+0.077
 Wally+0.074
 fisher+0.073
 Finn+0.068
 fossil+0.067
Neg logit contributions
chair-0.076
appointed-0.066
bows-0.064
 halls-0.063
 ceiling-0.062
Token It
Token ID632
Feature activation+5.29e-02

Pos logit contributions
 Wally+0.098
 fisher+0.096
 tuna+0.094
 shrimp+0.093
 fossil+0.093
Neg logit contributions
chair-0.069
appointed-0.069
irds-0.066
ring-0.066
clock-0.064
Token is
Token ID318
Feature activation+9.64e-02

Pos logit contributions
 Sebastian+0.109
 fisher+0.106
 Wally+0.106
 fisherman+0.101
Dave+0.099
Neg logit contributions
als-0.045
bows-0.044
chair-0.044
appointed-0.043
 sounds-0.042
Token a
Token ID257
Feature activation+9.69e-02

Pos logit contributions
 fisher+0.162
Dave+0.154
 Sebastian+0.154
 fisherman+0.152
 Wally+0.149
Neg logit contributions
 walls-0.099
clock-0.098
 sounds-0.097
 ceiling-0.096
 same-0.094
Token small
Token ID1402
Feature activation+8.89e-02

Pos logit contributions
 Sebastian+0.161
 Wally+0.157
 Fin+0.149
 Finn+0.147
 Otto+0.145
Neg logit contributions
 roof-0.116
 ceiling-0.114
 wall-0.109
 whole-0.107
 song-0.106
Token fish
Token ID5916
Feature activation+6.40e-02

Pos logit contributions
 Wally+0.105
Stan+0.098
 Sebastian+0.098
 Fin+0.093
 Finn+0.093
Neg logit contributions
ness-0.164
est-0.155
 roof-0.147
 ceiling-0.147
 wall-0.139
Token that
Token ID326
Feature activation+5.14e-02

Pos logit contributions
 Sebastian+0.107
 Wally+0.106
 Otto+0.097
 fisher+0.095
 Finn+0.095
Neg logit contributions
chair-0.094
ring-0.083
age-0.080
bows-0.077
 roof-0.075
Token lives
Token ID3160
Feature activation+2.99e-02

Pos logit contributions
 Sebastian+0.085
 Wally+0.081
 fisher+0.079
 fossil+0.075
 tuna+0.075
Neg logit contributions
chair-0.067
appointed-0.064
 halls-0.063
als-0.062
 ceiling-0.061
Token juice
Token ID13135
Feature activation+7.92e-02

Pos logit contributions
 Sebastian+0.093
 Wally+0.092
 Fin+0.086
 Finn+0.085
 Otto+0.083
Neg logit contributions
 roof-0.071
chair-0.069
 ceiling-0.066
appointed-0.062
bows-0.060
Token.
Token ID13
Feature activation-2.37e-02

Pos logit contributions
 Sebastian+0.144
 fossil+0.141
 fisher+0.136
 Wally+0.131
 Fin+0.130
Neg logit contributions
chair-0.099
 ceiling-0.090
 roof-0.088
als-0.085
bows-0.084
Token Sam
Token ID3409
Feature activation+4.80e-02

Pos logit contributions
appointed+0.038
unct+0.037
chair+0.037
ring+0.034
 halls+0.033
Neg logit contributions
 Wally-0.041
 tuna-0.038
 shrimp-0.037
 Sebastian-0.037
 Finn-0.037
Token wanted
Token ID2227
Feature activation+3.73e-02

Pos logit contributions
 Sebastian+0.082
 Wally+0.081
 fisher+0.080
 fossil+0.079
 tuna+0.077
Neg logit contributions
chair-0.068
appointed-0.063
cases-0.059
 halls-0.059
up-0.054
Token a
Token ID257
Feature activation+4.54e-02

Pos logit contributions
 fisher+0.075
 Sebastian+0.073
 fossil+0.073
 Wally+0.071
 fisherman+0.070
Neg logit contributions
chair-0.034
clock-0.033
 halls-0.033
up-0.032
ups-0.031
Token pizza
Token ID14256
Feature activation+9.63e-02

Pos logit contributions
 Sebastian+0.081
 Wally+0.078
 Fin+0.075
 Finn+0.073
 fisher+0.071
Neg logit contributions
chair-0.054
 roof-0.051
appointed-0.049
irds-0.048
 ceiling-0.047
Token and
Token ID290
Feature activation+2.06e-02

Pos logit contributions
 Sebastian+0.182
 fossil+0.168
 aimed+0.166
 Wally+0.166
Jon+0.165
Neg logit contributions
chair-0.098
bye-0.089
clock-0.084
 morning-0.083
cakes-0.082
Token a
Token ID257
Feature activation+5.62e-02

Pos logit contributions
 Wally+0.032
 fisher+0.032
 Sebastian+0.031
 fossil+0.031
 tuna+0.030
Neg logit contributions
chair-0.032
appointed-0.031
 halls-0.028
irds-0.027
clock-0.027
Token milk
Token ID7545
Feature activation+7.40e-02

Pos logit contributions
 Sebastian+0.101
 Wally+0.098
 Fin+0.093
 Finn+0.092
 Otto+0.089
Neg logit contributions
 roof-0.069
chair-0.065
 ceiling-0.063
appointed-0.060
clock-0.058
Token.
Token ID13
Feature activation-4.60e-03

Pos logit contributions
 fossil+0.147
 fisher+0.139
 Sebastian+0.137
 Wally+0.135
Bruce+0.132
Neg logit contributions
chair-0.074
ness-0.068
als-0.066
 roof-0.066
bows-0.066
Token Mom
Token ID11254
Feature activation+5.06e-02

Pos logit contributions
appointed+0.009
chair+0.009
unct+0.008
irds+0.008
 halls+0.008
Neg logit contributions
 Wally-0.006
 tuna-0.006
 shrimp-0.005
 Sebastian-0.005
 fisher-0.005
Token.
Token ID13
Feature activation+3.07e-02

Pos logit contributions
 Wally+0.031
 Sebastian+0.030
 fisher+0.029
 tuna+0.029
 shrimp+0.028
Neg logit contributions
chair-0.035
appointed-0.033
ring-0.030
bows-0.030
 halls-0.030
Token He
Token ID679
Feature activation+6.80e-02

Pos logit contributions
 shrimp+0.052
 Wally+0.051
 tuna+0.049
 Sebastian+0.049
 fisher+0.048
Neg logit contributions
chair-0.044
appointed-0.042
clock-0.040
 Lightning-0.040
irds-0.039
Token thinks
Token ID6834
Feature activation+3.35e-02

Pos logit contributions
 Sebastian+0.142
 Wally+0.140
 Otto+0.131
inn+0.127
Dave+0.127
Neg logit contributions
 sounds-0.065
 starts-0.061
 feels-0.058
bows-0.056
chair-0.056
Token it
Token ID340
Feature activation+1.09e-02

Pos logit contributions
 Sebastian+0.053
 Wally+0.053
 fisher+0.051
 fossil+0.050
 shrimp+0.049
Neg logit contributions
chair-0.050
irds-0.044
ong-0.043
 halls-0.043
 ceiling-0.043
Token is
Token ID318
Feature activation+7.61e-02

Pos logit contributions
 Wally+0.020
 Sebastian+0.019
 shrimp+0.019
 tuna+0.019
 fisher+0.019
Neg logit contributions
appointed-0.015
chair-0.015
bows-0.013
 halls-0.013
unct-0.013
Token a
Token ID257
Feature activation+9.57e-02

Pos logit contributions
Dave+0.156
 fisher+0.155
 fisherman+0.154
olph+0.153
inn+0.152
Neg logit contributions
 sounds-0.060
 biggest-0.058
appointed-0.053
 same-0.048
 walls-0.046
Token cool
Token ID3608
Feature activation+5.13e-02

Pos logit contributions
 Wally+0.198
 Otto+0.195
 Sebastian+0.193
inn+0.193
 Finn+0.188
Neg logit contributions
 job-0.084
 song-0.082
 roof-0.078
 whole-0.077
 biggest-0.070
Token gun
Token ID2485
Feature activation+2.69e-02

Pos logit contributions
 Wally+0.084
 Otto+0.080
 Finn+0.077
Dave+0.077
Stan+0.075
Neg logit contributions
ness-0.077
est-0.071
 roof-0.070
bye-0.069
bows-0.067
Token.
Token ID13
Feature activation+3.59e-02

Pos logit contributions
 Wally+0.044
 Sebastian+0.043
 shrimp+0.042
 fisher+0.041
 tuna+0.041
Neg logit contributions
chair-0.042
appointed-0.039
bows-0.037
ring-0.037
clock-0.036
Token
Token ID198
Feature activation+9.45e-03

Pos logit contributions
 shrimp+0.067
 tuna+0.063
 Wally+0.063
 Sebastian+0.061
 fisher+0.061
Neg logit contributions
chair-0.046
appointed-0.044
 Lightning-0.042
clock-0.041
EEP-0.040
Token
Token ID198
Feature activation+1.68e-02

Pos logit contributions
 Wally+0.014
 tuna+0.013
 shrimp+0.013
 fisher+0.013
 Sebastian+0.013
Neg logit contributions
appointed-0.017
chair-0.017
unct-0.015
 halls-0.015
irds-0.015
Token lunch
Token ID9965
Feature activation+1.45e-02

Pos logit contributions
appointed+0.060
chair+0.056
irds+0.055
cases+0.053
unct+0.053
Neg logit contributions
 tuna-0.033
 Wally-0.033
 shrimp-0.031
 Sebastian-0.029
 fisher-0.027
Token.
Token ID13
Feature activation+1.11e-02

Pos logit contributions
 Wally+0.020
 tuna+0.019
 Sebastian+0.019
 fisher+0.018
 shrimp+0.018
Neg logit contributions
appointed-0.027
chair-0.027
 halls-0.024
unct-0.024
irds-0.024
Token One
Token ID1881
Feature activation-1.69e-02

Pos logit contributions
 Wally+0.018
 tuna+0.017
 fisher+0.017
 shrimp+0.017
 Sebastian+0.017
Neg logit contributions
appointed-0.018
chair-0.018
irds-0.016
 halls-0.015
unct-0.015
Token had
Token ID550
Feature activation-1.98e-02

Pos logit contributions
appointed+0.022
chair+0.020
unct+0.020
irds+0.019
amas+0.019
Neg logit contributions
 Wally-0.034
 tuna-0.034
 shrimp-0.033
 fisher-0.031
 Sebastian-0.031
Token a
Token ID257
Feature activation+3.37e-02

Pos logit contributions
appointed+0.039
chair+0.037
unct+0.035
irds+0.035
ring+0.033
Neg logit contributions
 Wally-0.028
 tuna-0.027
 shrimp-0.026
 Sebastian-0.025
 fisher-0.024
Token tuna
Token ID38883
Feature activation+9.54e-02

Pos logit contributions
 Wally+0.064
 Sebastian+0.062
 fisher+0.059
 Fin+0.058
 Finn+0.057
Neg logit contributions
chair-0.043
 roof-0.040
appointed-0.040
 ceiling-0.038
 halls-0.037
Token sandwich
Token ID20433
Feature activation+6.93e-02

Pos logit contributions
 Sebastian+0.154
 Wally+0.144
 fisher+0.140
Stan+0.140
Dave+0.139
Neg logit contributions
chair-0.119
als-0.111
 ceiling-0.110
 roof-0.105
 wall-0.102
Token and
Token ID290
Feature activation+3.54e-02

Pos logit contributions
 fisher+0.113
 Sebastian+0.109
 Wally+0.108
 fossil+0.101
 Finn+0.100
Neg logit contributions
chair-0.114
bows-0.097
ring-0.095
case-0.093
cases-0.093
Token the
Token ID262
Feature activation+3.33e-02

Pos logit contributions
 Wally+0.048
 fisher+0.048
 Sebastian+0.047
 fossil+0.044
 tuna+0.043
Neg logit contributions
chair-0.059
appointed-0.058
 halls-0.054
 roof-0.052
irds-0.052
Token other
Token ID584
Feature activation+2.24e-02

Pos logit contributions
 Wally+0.040
 Sebastian+0.039
 fisher+0.036
 fossil+0.034
 Fin+0.034
Neg logit contributions
chair-0.059
appointed-0.059
unct-0.056
 ceiling-0.056
 halls-0.055
Token an
Token ID281
Feature activation+3.73e-02

Pos logit contributions
 Wally+0.036
 Sebastian+0.035
 fisher+0.033
 fossil+0.033
 tuna+0.032
Neg logit contributions
chair-0.033
appointed-0.033
 halls-0.030
ness-0.030
clock-0.030
Token They
Token ID1119
Feature activation+2.96e-02

Pos logit contributions
 fisher+0.065
 fossil+0.065
 shrimp+0.064
 tuna+0.063
 shark+0.063
Neg logit contributions
appointed-0.044
chair-0.042
clock-0.040
 Lightning-0.038
irds-0.038
Token had
Token ID550
Feature activation+7.49e-03

Pos logit contributions
 Wally+0.053
 tuna+0.050
 Sebastian+0.049
 shrimp+0.048
 fisher+0.047
Neg logit contributions
chair-0.043
appointed-0.043
 halls-0.042
 roof-0.039
unct-0.039
Token a
Token ID257
Feature activation+3.28e-02

Pos logit contributions
 Wally+0.012
 Sebastian+0.012
 tuna+0.011
 fisher+0.011
 shrimp+0.011
Neg logit contributions
appointed-0.012
chair-0.011
 halls-0.011
clock-0.011
unct-0.010
Token lot
Token ID1256
Feature activation+1.53e-02

Pos logit contributions
 Sebastian+0.068
 Wally+0.068
 Finn+0.064
 Crab+0.063
 Otto+0.063
Neg logit contributions
 roof-0.034
chair-0.032
appointed-0.031
 ceiling-0.031
 stair-0.028
Token of
Token ID286
Feature activation+1.51e-02

Pos logit contributions
 Wally+0.031
 Sebastian+0.031
 fisher+0.029
 tuna+0.029
 Otto+0.029
Neg logit contributions
chair-0.020
appointed-0.018
 halls-0.017
up-0.017
bows-0.017
Token choices
Token ID7747
Feature activation+9.50e-02

Pos logit contributions
 Wally+0.028
 Sebastian+0.027
 fisher+0.025
 Crab+0.025
 tuna+0.025
Neg logit contributions
appointed-0.022
chair-0.022
 halls-0.021
clock-0.019
bows-0.019
Token.
Token ID13
Feature activation+4.47e-02

Pos logit contributions
 Sebastian+0.215
Fin+0.209
 fossil+0.208
 Otto+0.206
 Wally+0.205
Neg logit contributions
bows-0.071
chair-0.070
 ceiling-0.061
ous-0.058
ring-0.056
Token They
Token ID1119
Feature activation+3.77e-02

Pos logit contributions
 fisher+0.082
 fossil+0.081
 Sebastian+0.080
 fisherman+0.079
 shark+0.079
Neg logit contributions
chair-0.049
clock-0.048
appointed-0.048
 Everyone-0.046
 Lightning-0.045
Token were
Token ID547
Feature activation+2.76e-02

Pos logit contributions
 Wally+0.068
 Sebastian+0.064
 tuna+0.064
 fisher+0.062
 fossil+0.061
Neg logit contributions
chair-0.052
appointed-0.051
 halls-0.050
 roof-0.048
unct-0.046
Token not
Token ID407
Feature activation+7.55e-03

Pos logit contributions
 Wally+0.051
 Sebastian+0.050
 fossil+0.049
 tuna+0.048
 Crab+0.048
Neg logit contributions
appointed-0.037
 halls-0.032
clock-0.030
 floors-0.030
chair-0.030
Token bored
Token ID17533
Feature activation+9.08e-03

Pos logit contributions
 Wally+0.013
 tuna+0.012
 Sebastian+0.012
 shrimp+0.011
 fisher+0.011
Neg logit contributions
appointed-0.012
chair-0.011
unct-0.010
 halls-0.010
irds-0.010
Token wanted
Token ID2227
Feature activation+2.06e-02

Pos logit contributions
 Wally+0.056
 tuna+0.054
 Sebastian+0.052
 fossil+0.052
 fisher+0.051
Neg logit contributions
appointed-0.049
chair-0.049
 halls-0.043
irds-0.042
cases-0.042
Token to
Token ID284
Feature activation+1.68e-02

Pos logit contributions
 Wally+0.039
 fisher+0.039
 Sebastian+0.038
 fossil+0.038
 tuna+0.037
Neg logit contributions
chair-0.025
appointed-0.023
 halls-0.021
clock-0.020
filled-0.020
Token look
Token ID804
Feature activation+3.87e-02

Pos logit contributions
 tuna+0.030
 Wally+0.029
 shrimp+0.028
 fossil+0.028
 Sebastian+0.028
Neg logit contributions
appointed-0.025
chair-0.023
unct-0.020
irds-0.020
 halls-0.020
Token for
Token ID329
Feature activation+3.72e-02

Pos logit contributions
 Wally+0.072
 fisher+0.071
 Sebastian+0.070
 tuna+0.069
 Crab+0.069
Neg logit contributions
chair-0.051
appointed-0.051
unct-0.043
cases-0.043
 halls-0.042
Token a
Token ID257
Feature activation+2.82e-02

Pos logit contributions
 fisher+0.059
 fossil+0.058
 fisherman+0.056
 Wally+0.055
 Sebastian+0.054
Neg logit contributions
chair-0.052
appointed-0.050
 walls-0.046
 halls-0.046
 floors-0.045
Token sport
Token ID6332
Feature activation+9.49e-02

Pos logit contributions
 Wally+0.049
 Sebastian+0.048
 tuna+0.046
 fisher+0.045
 Finn+0.044
Neg logit contributions
appointed-0.038
chair-0.038
 ceiling-0.033
irds-0.032
 roof-0.032
Token he
Token ID339
Feature activation+1.49e-02

Pos logit contributions
 Sebastian+0.174
 fossil+0.163
Stan+0.159
 Wally+0.157
 Otto+0.157
Neg logit contributions
chair-0.128
case-0.093
 exam-0.092
seek-0.092
 stair-0.092
Token could
Token ID714
Feature activation+2.17e-02

Pos logit contributions
 Wally+0.024
 tuna+0.024
 Sebastian+0.023
 shrimp+0.023
 fisher+0.023
Neg logit contributions
appointed-0.024
chair-0.023
irds-0.021
unct-0.021
 halls-0.021
Token do
Token ID466
Feature activation+3.04e-02

Pos logit contributions
 tuna+0.040
 Wally+0.040
 Sebastian+0.039
 shrimp+0.038
 fossil+0.038
Neg logit contributions
appointed-0.031
chair-0.029
unct-0.025
irds-0.025
 halls-0.025
Token.
Token ID13
Feature activation-2.12e-02

Pos logit contributions
 Wally+0.051
 Sebastian+0.050
 tuna+0.050
 fisher+0.048
 shrimp+0.048
Neg logit contributions
appointed-0.047
chair-0.047
 halls-0.042
unct-0.040
cases-0.040
Token Then
Token ID3244
Feature activation-1.57e-02

Pos logit contributions
appointed+0.041
unct+0.040
chair+0.040
irds+0.039
ring+0.038
Neg logit contributions
 Wally-0.027
 Sebastian-0.025
 tuna-0.025
 shrimp-0.023
 fisher-0.023
Token.
Token ID13
Feature activation-3.03e-02

Pos logit contributions
 Wally+0.013
 tuna+0.012
 shrimp+0.012
 Sebastian+0.012
 fisher+0.012
Neg logit contributions
appointed-0.014
chair-0.013
unct-0.012
irds-0.012
 halls-0.012
Token This
Token ID770
Feature activation+2.74e-02

Pos logit contributions
appointed+0.057
unct+0.052
chair+0.050
 halls+0.050
cases+0.048
Neg logit contributions
 Wally-0.046
 tuna-0.044
 shrimp-0.042
 Sebastian-0.042
 Finn-0.041
Token shrimp
Token ID30060
Feature activation+1.10e-02

Pos logit contributions
 Wally+0.052
 fisher+0.052
 Sebastian+0.052
 Finn+0.049
 shrimp+0.047
Neg logit contributions
chair-0.033
appointed-0.029
est-0.028
 roof-0.028
 ceiling-0.027
Token is
Token ID318
Feature activation+6.49e-02

Pos logit contributions
 Wally+0.019
 Sebastian+0.018
 tuna+0.018
 fisher+0.017
 shrimp+0.017
Neg logit contributions
appointed-0.017
chair-0.017
unct-0.015
 halls-0.015
irds-0.015
Token not
Token ID407
Feature activation+3.23e-02

Pos logit contributions
 fisher+0.122
 Sebastian+0.121
 fisherman+0.119
Dave+0.118
 Otto+0.111
Neg logit contributions
appointed-0.062
 same-0.056
clock-0.053
 walls-0.052
 rain-0.051
Token a
Token ID257
Feature activation+9.45e-02

Pos logit contributions
 Wally+0.054
 Sebastian+0.054
 fisher+0.052
 Finn+0.051
 Otto+0.050
Neg logit contributions
appointed-0.044
ong-0.039
ring-0.038
chair-0.038
bows-0.037
Token food
Token ID2057
Feature activation+6.67e-02

Pos logit contributions
 Sebastian+0.165
 Wally+0.155
 Finn+0.150
 Otto+0.148
inn+0.148
Neg logit contributions
 roof-0.105
 same-0.097
 ceiling-0.093
 whole-0.092
 job-0.092
Token.
Token ID13
Feature activation+3.10e-02

Pos logit contributions
 Sebastian+0.124
 fisher+0.119
 Wally+0.118
 fossil+0.117
 Finn+0.113
Neg logit contributions
chair-0.090
 ceiling-0.072
 roof-0.072
case-0.069
clock-0.067
Token It
Token ID632
Feature activation+9.08e-03

Pos logit contributions
 Wally+0.058
 Sebastian+0.056
 fisher+0.056
 tuna+0.056
 shrimp+0.055
Neg logit contributions
chair-0.037
appointed-0.036
ring-0.034
irds-0.034
clock-0.032
Token is
Token ID318
Feature activation+7.73e-02

Pos logit contributions
 Wally+0.017
 Sebastian+0.016
 tuna+0.016
 fisher+0.016
 shrimp+0.015
Neg logit contributions
appointed-0.013
chair-0.013
unct-0.011
 halls-0.011
irds-0.011
Token a
Token ID257
Feature activation+9.05e-02

Pos logit contributions
 fisher+0.145
 Sebastian+0.141
Dave+0.139
 fisherman+0.134
 Wally+0.129
Neg logit contributions
appointed-0.075
 same-0.072
 ceiling-0.071
clock-0.070
 roof-0.068
TokenThese
Token ID4711
Feature activation+2.77e-02

Pos logit contributions
 Wally+0.021
 tuna+0.020
 fisher+0.020
 shrimp+0.019
 Sebastian+0.019
Neg logit contributions
appointed-0.019
chair-0.019
irds-0.016
clock-0.016
 halls-0.016
Token are
Token ID389
Feature activation+6.88e-02

Pos logit contributions
 Wally+0.044
 fisher+0.041
 Sebastian+0.040
 Finn+0.040
 tuna+0.040
Neg logit contributions
appointed-0.042
chair-0.039
bows-0.038
unct-0.038
up-0.038
Token peanuts
Token ID45612
Feature activation+1.26e-02

Pos logit contributions
 fisher+0.107
 fisherman+0.100
 Sebastian+0.095
Dave+0.093
 fossil+0.093
Neg logit contributions
appointed-0.084
 walls-0.083
 roof-0.083
clock-0.082
 biggest-0.081
Token.
Token ID13
Feature activation+5.68e-02

Pos logit contributions
 Wally+0.018
 Sebastian+0.017
 tuna+0.017
 fisher+0.017
 shrimp+0.016
Neg logit contributions
chair-0.023
appointed-0.022
bows-0.020
ring-0.020
unct-0.020
Token They
Token ID1119
Feature activation+4.18e-02

Pos logit contributions
 fisher+0.099
 Wally+0.098
 shrimp+0.096
 tuna+0.096
 fisherman+0.094
Neg logit contributions
chair-0.073
appointed-0.071
clock-0.063
irds-0.062
 Lightning-0.061
Token are
Token ID389
Feature activation+9.43e-02

Pos logit contributions
 Wally+0.083
 Sebastian+0.079
 fisher+0.079
 tuna+0.077
 Finn+0.077
Neg logit contributions
appointed-0.048
bows-0.042
unct-0.041
 roof-0.041
chair-0.041
Token interesting
Token ID3499
Feature activation+3.90e-02

Pos logit contributions
 fisher+0.171
Patrick+0.165
Dave+0.161
 fisherman+0.161
inn+0.155
Neg logit contributions
 walls-0.096
 roof-0.096
 biggest-0.090
 pictures-0.087
 same-0.086
Token because
Token ID780
Feature activation+6.17e-02

Pos logit contributions
 Wally+0.062
 Sebastian+0.059
 fisher+0.056
 Otto+0.055
 tuna+0.054
Neg logit contributions
chair-0.061
 halls-0.057
appointed-0.056
irds-0.056
clock-0.055
Token they
Token ID484
Feature activation+4.82e-02

Pos logit contributions
 fisher+0.103
 Wally+0.100
 fossil+0.096
 Finn+0.093
 Otto+0.092
Neg logit contributions
chair-0.081
 roof-0.076
appointed-0.076
 halls-0.074
filled-0.072
Token grow
Token ID1663
Feature activation+1.56e-02

Pos logit contributions
 Wally+0.095
 Sebastian+0.091
 fisher+0.090
 tuna+0.087
 Finn+0.086
Neg logit contributions
appointed-0.054
bows-0.051
unct-0.048
 halls-0.047
chair-0.047
Token underground
Token ID11447
Feature activation+1.35e-02

Pos logit contributions
 Wally+0.027
 Sebastian+0.026
 tuna+0.025
 fisher+0.025
 shrimp+0.024
Neg logit contributions
appointed-0.023
chair-0.021
 halls-0.020
irds-0.020
unct-0.020
Token regular
Token ID3218
Feature activation+3.85e-02

Pos logit contributions
 Sebastian+0.121
 Wally+0.115
 Otto+0.109
 Finn+0.107
 Hank+0.104
Neg logit contributions
clock-0.085
chair-0.083
bye-0.077
 morning-0.075
 job-0.074
Token pizza
Token ID14256
Feature activation+8.57e-02

Pos logit contributions
 Wally+0.058
 Sebastian+0.055
 Otto+0.053
 Finn+0.053
 Fin+0.049
Neg logit contributions
clock-0.065
chair-0.063
ness-0.060
 halls-0.060
est-0.059
Token.
Token ID13
Feature activation+3.59e-02

Pos logit contributions
 Sebastian+0.155
 Wally+0.144
 Otto+0.139
 Crush+0.133
 tuna+0.133
Neg logit contributions
chair-0.113
clock-0.104
bye-0.099
bows-0.096
case-0.092
Token A
Token ID317
Feature activation+3.74e-02

Pos logit contributions
 fossil+0.056
 Sebastian+0.054
 Wally+0.053
 fisher+0.052
 shrimp+0.052
Neg logit contributions
chair-0.054
appointed-0.052
clock-0.048
 halls-0.046
 Everyone-0.045
Token regular
Token ID3218
Feature activation+4.18e-02

Pos logit contributions
 Wally+0.066
 Sebastian+0.065
 Finn+0.062
 Otto+0.061
 Fin+0.059
Neg logit contributions
chair-0.053
 halls-0.047
clock-0.047
appointed-0.047
 roof-0.047
Token pizza
Token ID14256
Feature activation+9.35e-02

Pos logit contributions
 Wally+0.065
 Sebastian+0.063
 Finn+0.061
 Otto+0.060
 Fin+0.058
Neg logit contributions
chair-0.069
ness-0.062
 halls-0.061
 roof-0.060
appointed-0.060
Token has
Token ID468
Feature activation+3.80e-02

Pos logit contributions
 Sebastian+0.179
 Wally+0.164
 Otto+0.164
 Finn+0.154
 Fin+0.152
Neg logit contributions
chair-0.118
clock-0.103
bye-0.099
case-0.099
cakes-0.097
Token cheese
Token ID9891
Feature activation+8.64e-02

Pos logit contributions
 Wally+0.066
 Sebastian+0.063
 Finn+0.060
 Otto+0.060
 fisher+0.060
Neg logit contributions
chair-0.052
clock-0.051
appointed-0.050
 halls-0.049
 ceiling-0.048
Token and
Token ID290
Feature activation+3.30e-02

Pos logit contributions
 Sebastian+0.142
 Wally+0.133
 Finn+0.129
 Otto+0.128
 Fin+0.122
Neg logit contributions
chair-0.122
bows-0.111
cases-0.111
 ceiling-0.110
 walls-0.110
Token sauce
Token ID10746
Feature activation+6.84e-02

Pos logit contributions
 Wally+0.044
 Sebastian+0.043
 fisher+0.040
 Finn+0.039
 Otto+0.038
Neg logit contributions
chair-0.058
appointed-0.057
clock-0.055
 halls-0.054
 ceiling-0.053
Token and
Token ID290
Feature activation+3.51e-02

Pos logit contributions
 Sebastian+0.116
 Wally+0.111
 fisher+0.106
 fossil+0.105
 Otto+0.104
Neg logit contributions
chair-0.096
cases-0.089
bows-0.087
ong-0.085
ring-0.085
Token to
Token ID284
Feature activation-1.44e-02

Pos logit contributions
appointed+0.007
chair+0.006
unct+0.006
irds+0.006
 halls+0.006
Neg logit contributions
 Wally-0.007
 tuna-0.007
 shrimp-0.007
 Sebastian-0.007
 fisher-0.006
Token play
Token ID711
Feature activation+3.36e-02

Pos logit contributions
appointed+0.023
chair+0.022
unct+0.021
irds+0.020
amas+0.019
Neg logit contributions
 Wally-0.025
 tuna-0.024
 shrimp-0.023
 Sebastian-0.023
 fisher-0.023
Token with
Token ID351
Feature activation+1.13e-02

Pos logit contributions
 Sebastian+0.054
 Wally+0.052
 fossil+0.049
 fisher+0.049
 Crab+0.048
Neg logit contributions
chair-0.050
amas-0.046
 halls-0.046
appointed-0.046
 walls-0.045
Token toys
Token ID14958
Feature activation+4.88e-02

Pos logit contributions
 Wally+0.019
 Sebastian+0.018
 fisher+0.018
 tuna+0.017
 fossil+0.016
Neg logit contributions
chair-0.017
appointed-0.016
 halls-0.016
flash-0.015
 roof-0.015
Token.
Token ID13
Feature activation+2.85e-02

Pos logit contributions
 Sebastian+0.100
 Wally+0.095
 Otto+0.091
 fisher+0.091
 fossil+0.089
Neg logit contributions
chair-0.065
bows-0.053
 ceiling-0.051
 halls-0.051
clock-0.049
Token Ben
Token ID3932
Feature activation+9.35e-02

Pos logit contributions
 Sebastian+0.061
 fossil+0.060
 fisher+0.060
 Wally+0.059
 tuna+0.058
Neg logit contributions
appointed-0.028
chair-0.027
 halls-0.024
clock-0.023
 ceiling-0.023
Token has
Token ID468
Feature activation+7.15e-04

Pos logit contributions
 Sebastian+0.234
 Crab+0.216
 Wally+0.215
 fisher+0.212
 salmon+0.210
Neg logit contributions
's-0.061
chair-0.060
 halls-0.055
 starts-0.045
 sounds-0.043
Token a
Token ID257
Feature activation+3.73e-02

Pos logit contributions
 Wally+0.001
 tuna+0.001
 Sebastian+0.001
 shrimp+0.001
 fisher+0.001
Neg logit contributions
appointed-0.001
chair-0.001
 halls-0.001
unct-0.001
irds-0.001
Token new
Token ID649
Feature activation+3.01e-02

Pos logit contributions
 Wally+0.078
 Sebastian+0.077
 fisher+0.074
 Finn+0.072
 Otto+0.071
Neg logit contributions
chair-0.040
 roof-0.038
 ceiling-0.036
appointed-0.033
 stair-0.032
Token toy
Token ID13373
Feature activation+5.37e-02

Pos logit contributions
 Wally+0.058
 Sebastian+0.055
 fisher+0.054
 Finn+0.054
 Otto+0.053
Neg logit contributions
chair-0.039
 roof-0.037
appointed-0.036
 ceiling-0.034
bows-0.033
Token.
Token ID13
Feature activation+3.71e-02

Pos logit contributions
 Wally+0.100
 Sebastian+0.096
 fisher+0.093
 Otto+0.092
Stan+0.091
Neg logit contributions
chair-0.078
 roof-0.069
 ceiling-0.066
clock-0.060
ring-0.059
Token
Token ID198
Feature activation+1.66e-03

Pos logit contributions
 shrimp+0.030
 Wally+0.030
 tuna+0.029
 fisher+0.028
 Sebastian+0.028
Neg logit contributions
chair-0.031
appointed-0.030
clock-0.027
irds-0.026
 halls-0.026
Token
Token ID198
Feature activation+2.74e-03

Pos logit contributions
 Wally+0.002
 tuna+0.002
 shrimp+0.002
 Sebastian+0.002
 fisher+0.002
Neg logit contributions
appointed-0.003
chair-0.003
unct-0.003
irds-0.003
 halls-0.003
Token"
Token ID1
Feature activation+1.02e-02

Pos logit contributions
 Wally+0.004
 tuna+0.004
 shrimp+0.004
 Sebastian+0.004
 fisher+0.004
Neg logit contributions
appointed-0.005
chair-0.005
irds-0.005
 halls-0.004
unct-0.004
TokenBecause
Token ID8128
Feature activation+4.57e-02

Pos logit contributions
 Wally+0.017
 tuna+0.017
 shrimp+0.016
 Sebastian+0.016
 fisher+0.015
Neg logit contributions
appointed-0.017
chair-0.016
clock-0.015
irds-0.014
 halls-0.014
Token tanks
Token ID11657
Feature activation+4.78e-02

Pos logit contributions
 Wally+0.081
 Sebastian+0.077
 fisher+0.074
 fossil+0.072
 Crab+0.072
Neg logit contributions
chair-0.061
 halls-0.056
 ceiling-0.054
 roof-0.054
appointed-0.053
Token are
Token ID389
Feature activation+9.35e-02

Pos logit contributions
 Wally+0.095
 Sebastian+0.095
 fisher+0.088
 Crab+0.088
 Otto+0.088
Neg logit contributions
chair-0.053
appointed-0.051
ong-0.049
bows-0.046
 halls-0.046
Token loud
Token ID7812
Feature activation+1.65e-02

Pos logit contributions
Dave+0.177
 Sebastian+0.175
 Crab+0.171
Stan+0.169
 fisherman+0.166
Neg logit contributions
 walls-0.086
clock-0.084
 sounds-0.082
 noises-0.079
 roof-0.077
Token and
Token ID290
Feature activation+3.37e-02

Pos logit contributions
 Wally+0.021
 Sebastian+0.019
 tuna+0.019
 shrimp+0.019
 fisher+0.019
Neg logit contributions
chair-0.033
appointed-0.033
bows-0.029
 halls-0.029
ring-0.029
Token scary
Token ID14343
Feature activation+1.47e-02

Pos logit contributions
 Wally+0.061
 Sebastian+0.061
 Crab+0.057
 fisher+0.056
 shrimp+0.055
Neg logit contributions
appointed-0.046
chair-0.042
 halls-0.041
clock-0.040
flash-0.038
Token,"
Token ID553
Feature activation+6.95e-03

Pos logit contributions
 Wally+0.024
 Sebastian+0.023
 tuna+0.022
 shrimp+0.022
 fisher+0.022
Neg logit contributions
chair-0.024
appointed-0.023
 halls-0.021
clock-0.021
irds-0.021
Token Mia
Token ID32189
Feature activation-3.87e-03

Pos logit contributions
 Wally+0.010
 tuna+0.009
 shrimp+0.009
 Sebastian+0.009
 fisher+0.009
Neg logit contributions
appointed-0.012
chair-0.012
unct-0.011
 halls-0.011
irds-0.011
Token and
Token ID290
Feature activation+1.32e-02

Pos logit contributions
 Sebastian+0.051
 Wally+0.051
 fisher+0.045
 tuna+0.045
 Otto+0.045
Neg logit contributions
chair-0.056
appointed-0.053
 halls-0.052
 ceiling-0.052
iece-0.051
Token look
Token ID804
Feature activation+2.80e-02

Pos logit contributions
 Wally+0.023
 Sebastian+0.022
 tuna+0.022
 fisher+0.022
 shrimp+0.021
Neg logit contributions
appointed-0.019
chair-0.018
 halls-0.017
unct-0.017
 roof-0.017
Token for
Token ID329
Feature activation+3.30e-02

Pos logit contributions
 Wally+0.050
 Sebastian+0.048
 fisher+0.046
 Crab+0.046
 tuna+0.046
Neg logit contributions
chair-0.039
appointed-0.038
 halls-0.037
unct-0.036
ong-0.035
Token bugs
Token ID11316
Feature activation+4.97e-02

Pos logit contributions
 Wally+0.061
 Sebastian+0.058
 fisher+0.055
 Otto+0.053
 fossil+0.052
Neg logit contributions
 halls-0.041
 walls-0.041
 floors-0.040
 roof-0.040
 ceiling-0.039
Token.
Token ID13
Feature activation+2.86e-02

Pos logit contributions
 Sebastian+0.099
 Wally+0.092
 fisher+0.090
 Otto+0.087
 fossil+0.086
Neg logit contributions
chair-0.064
bows-0.056
 halls-0.055
ring-0.055
als-0.054
Token Ben
Token ID3932
Feature activation+9.34e-02

Pos logit contributions
 fossil+0.047
 fisher+0.045
 tuna+0.044
 Wally+0.044
 Sebastian+0.044
Neg logit contributions
appointed-0.043
chair-0.039
irds-0.037
 halls-0.037
rolling-0.035
Token finds
Token ID7228
Feature activation+2.08e-02

Pos logit contributions
 Sebastian+0.212
 Wally+0.189
 Crab+0.188
 fisher+0.186
 Crush+0.185
Neg logit contributions
's-0.080
chair-0.080
 halls-0.079
 starts-0.070
 sounds-0.068
Token a
Token ID257
Feature activation+4.80e-02

Pos logit contributions
 Wally+0.036
 Sebastian+0.035
 fisher+0.035
 tuna+0.033
 fossil+0.033
Neg logit contributions
chair-0.027
appointed-0.026
 halls-0.025
irds-0.025
iece-0.024
Token big
Token ID1263
Feature activation+5.94e-02

Pos logit contributions
 Wally+0.097
 Sebastian+0.097
 fisher+0.090
 Fin+0.087
 Finn+0.087
Neg logit contributions
 roof-0.054
 ceiling-0.052
chair-0.048
 stair-0.047
appointed-0.045
Token worm
Token ID23849
Feature activation+6.27e-02

Pos logit contributions
 Sebastian+0.113
 Wally+0.112
 fisher+0.105
 Finn+0.102
 Otto+0.102
Neg logit contributions
 roof-0.075
 ceiling-0.073
 stair-0.070
 wall-0.068
chair-0.068
Token and
Token ID290
Feature activation+3.48e-02

Pos logit contributions
 Sebastian+0.115
 fisher+0.112
 Wally+0.106
 Fin+0.098
 Finn+0.098
Neg logit contributions
chair-0.094
age-0.081
case-0.079
 roof-0.079
 ceiling-0.077
Token.
Token ID13
Feature activation-1.61e-04

Pos logit contributions
 Wally+0.017
 Sebastian+0.016
 tuna+0.016
 fisher+0.016
 shrimp+0.016
Neg logit contributions
appointed-0.018
chair-0.018
 halls-0.016
unct-0.016
clock-0.016
Token He
Token ID679
Feature activation+1.73e-02

Pos logit contributions
appointed+0.000
chair+0.000
unct+0.000
irds+0.000
 halls+0.000
Neg logit contributions
 Wally-0.000
 tuna-0.000
 shrimp-0.000
 Sebastian-0.000
 fisher-0.000
Token found
Token ID1043
Feature activation+7.94e-03

Pos logit contributions
 Wally+0.029
 Sebastian+0.028
 fisher+0.028
 tuna+0.028
 shrimp+0.027
Neg logit contributions
appointed-0.026
chair-0.026
 halls-0.024
unct-0.023
irds-0.022
Token some
Token ID617
Feature activation+5.17e-02

Pos logit contributions
 Wally+0.013
 fisher+0.012
 Sebastian+0.012
 tuna+0.012
 shrimp+0.012
Neg logit contributions
appointed-0.012
chair-0.012
unct-0.011
 halls-0.011
irds-0.011
Token plastic
Token ID7309
Feature activation+7.59e-02

Pos logit contributions
 Sebastian+0.100
 Wally+0.096
 fisher+0.095
 Crab+0.090
 Finn+0.090
Neg logit contributions
chair-0.056
appointed-0.055
 walls-0.054
 roof-0.054
 halls-0.053
Token containers
Token ID16472
Feature activation+9.34e-02

Pos logit contributions
 Sebastian+0.132
 Wally+0.125
 fisher+0.114
Dave+0.114
Stan+0.112
Neg logit contributions
 walls-0.103
als-0.103
 roof-0.102
 ceiling-0.098
chair-0.097
Token.
Token ID13
Feature activation+1.81e-02

Pos logit contributions
 Sebastian+0.183
 fisher+0.181
 Wally+0.173
 fossil+0.172
 Finn+0.168
Neg logit contributions
chair-0.106
cakes-0.091
case-0.089
 walls-0.089
als-0.088
Token He
Token ID679
Feature activation+3.31e-02

Pos logit contributions
 tuna+0.026
 shrimp+0.026
 fisher+0.026
 Sebastian+0.026
 Wally+0.026
Neg logit contributions
appointed-0.031
chair-0.029
irds-0.027
 halls-0.026
amas-0.025
Token thought
Token ID1807
Feature activation+3.94e-02

Pos logit contributions
 Wally+0.057
 Sebastian+0.057
 fisher+0.054
 tuna+0.052
 shrimp+0.052
Neg logit contributions
appointed-0.047
chair-0.045
 halls-0.044
unct-0.040
cases-0.039
Token they
Token ID484
Feature activation+9.70e-03

Pos logit contributions
 Sebastian+0.063
 fisher+0.062
 Wally+0.059
 fossil+0.058
 tuna+0.056
Neg logit contributions
chair-0.060
 roof-0.052
 halls-0.052
ong-0.052
appointed-0.051
Token looked
Token ID3114
Feature activation+6.36e-02

Pos logit contributions
 Wally+0.017
 tuna+0.016
 fisher+0.016
 Sebastian+0.016
 shrimp+0.016
Neg logit contributions
appointed-0.015
chair-0.014
unct-0.013
 halls-0.013
irds-0.013
Token
Token ID198
Feature activation-3.92e-03

Pos logit contributions
 Wally+0.000
 tuna+0.000
 shrimp+0.000
 Sebastian+0.000
 fisher+0.000
Neg logit contributions
appointed-0.000
chair-0.000
unct-0.000
irds-0.000
 halls-0.000
Token"
Token ID1
Feature activation+4.85e-03

Pos logit contributions
appointed+0.008
chair+0.007
unct+0.007
irds+0.007
 halls+0.007
Neg logit contributions
 Wally-0.005
 tuna-0.005
 shrimp-0.005
 Sebastian-0.005
 fisher-0.005
TokenA
Token ID32
Feature activation+3.07e-02

Pos logit contributions
 Wally+0.008
 tuna+0.008
 Sebastian+0.007
 shrimp+0.007
 fisher+0.007
Neg logit contributions
appointed-0.008
chair-0.008
irds-0.007
unct-0.007
 halls-0.007
Token mineral
Token ID18352
Feature activation+4.28e-02

Pos logit contributions
 Wally+0.042
 Sebastian+0.038
 Finn+0.035
 fisher+0.034
 tuna+0.034
Neg logit contributions
chair-0.055
appointed-0.054
unct-0.050
est-0.049
ness-0.048
Token is
Token ID318
Feature activation+6.45e-02

Pos logit contributions
 Wally+0.073
 Sebastian+0.072
 fisher+0.067
 Otto+0.064
 Finn+0.064
Neg logit contributions
chair-0.066
unct-0.056
appointed-0.055
 ceiling-0.054
up-0.054
Token a
Token ID257
Feature activation+9.33e-02

Pos logit contributions
 fisher+0.125
 Wally+0.123
 Sebastian+0.121
 fisherman+0.121
Dave+0.118
Neg logit contributions
 ceiling-0.062
 walls-0.061
 roof-0.056
ong-0.056
appointed-0.055
Token kind
Token ID1611
Feature activation+6.18e-02

Pos logit contributions
 Wally+0.150
 Sebastian+0.142
 fisher+0.136
Stan+0.132
 Fin+0.131
Neg logit contributions
 ceiling-0.134
 roof-0.128
 wall-0.123
 biggest-0.116
 top-0.113
Token of
Token ID286
Feature activation+5.93e-02

Pos logit contributions
 Wally+0.116
 Sebastian+0.105
 Finn+0.105
 Otto+0.103
Stan+0.102
Neg logit contributions
est-0.086
ness-0.084
chair-0.075
 ceiling-0.072
ly-0.070
Token rock
Token ID3881
Feature activation+1.18e-02

Pos logit contributions
 Wally+0.091
 fisher+0.088
 Sebastian+0.088
Stan+0.085
 fisherman+0.080
Neg logit contributions
ness-0.088
 ceiling-0.087
chair-0.084
appointed-0.083
 walls-0.082
Token that
Token ID326
Feature activation+1.03e-02

Pos logit contributions
 Wally+0.019
 Sebastian+0.018
 tuna+0.018
 fisher+0.017
 shrimp+0.017
Neg logit contributions
chair-0.020
appointed-0.019
unct-0.018
ring-0.017
irds-0.017
Token comes
Token ID2058
Feature activation-1.43e-02

Pos logit contributions
 Wally+0.017
 Sebastian+0.016
 tuna+0.016
 fisher+0.016
 shrimp+0.016
Neg logit contributions
appointed-0.016
chair-0.016
unct-0.014
 halls-0.014
irds-0.014
Tokenily
Token ID813
Feature activation-1.59e-03

Pos logit contributions
appointed+0.006
chair+0.005
unct+0.005
irds+0.005
 halls+0.005
Neg logit contributions
 Wally-0.005
 tuna-0.005
 shrimp-0.005
 Sebastian-0.005
 fisher-0.005
Token offered
Token ID4438
Feature activation-9.43e-03

Pos logit contributions
appointed+0.003
chair+0.003
unct+0.003
irds+0.003
 halls+0.003
Neg logit contributions
 Wally-0.002
 tuna-0.002
 shrimp-0.002
 Sebastian-0.002
 fisher-0.002
Token Tim
Token ID5045
Feature activation+2.45e-03

Pos logit contributions
appointed+0.016
chair+0.016
unct+0.014
irds+0.014
 halls+0.014
Neg logit contributions
 Wally-0.015
 tuna-0.015
 shrimp-0.014
 Sebastian-0.014
 fisher-0.014
Tokenmy
Token ID1820
Feature activation+1.01e-02

Pos logit contributions
 Wally+0.004
 tuna+0.004
 Sebastian+0.004
 shrimp+0.004
 fisher+0.004
Neg logit contributions
appointed-0.004
chair-0.003
unct-0.003
irds-0.003
 halls-0.003
Token some
Token ID617
Feature activation+2.73e-02

Pos logit contributions
 Wally+0.016
 Sebastian+0.015
 tuna+0.015
 fisher+0.015
 shrimp+0.014
Neg logit contributions
appointed-0.017
chair-0.016
unct-0.015
irds-0.015
 halls-0.014
Token food
Token ID2057
Feature activation+9.31e-02

Pos logit contributions
 Wally+0.045
 Sebastian+0.044
 fisher+0.041
 Finn+0.040
 tuna+0.039
Neg logit contributions
chair-0.043
appointed-0.040
 halls-0.037
unct-0.037
irds-0.036
Token,
Token ID11
Feature activation+5.50e-02

Pos logit contributions
 fisher+0.160
 Sebastian+0.159
 fossil+0.154
 Wally+0.149
 fisherman+0.139
Neg logit contributions
chair-0.120
cakes-0.104
ring-0.101
age-0.101
bows-0.100
Token but
Token ID475
Feature activation+4.45e-03

Pos logit contributions
 fisher+0.086
 fossil+0.083
 Sebastian+0.079
 Wally+0.078
 shark+0.076
Neg logit contributions
chair-0.084
appointed-0.082
bars-0.074
 roof-0.074
irds-0.072
Token he
Token ID339
Feature activation+3.32e-02

Pos logit contributions
 Wally+0.007
 tuna+0.007
 fisher+0.007
 shrimp+0.007
 Sebastian+0.006
Neg logit contributions
appointed-0.008
chair-0.007
unct-0.007
irds-0.007
 halls-0.006
Token said
Token ID531
Feature activation+1.84e-02

Pos logit contributions
 Wally+0.060
 fisher+0.059
 tuna+0.059
 shrimp+0.058
 Sebastian+0.058
Neg logit contributions
appointed-0.044
chair-0.042
unct-0.038
owed-0.035
up-0.035
Token no
Token ID645
Feature activation-5.78e-03

Pos logit contributions
 Wally+0.026
 tuna+0.026
 fisher+0.026
 Sebastian+0.025
 fossil+0.025
Neg logit contributions
appointed-0.031
chair-0.031
unct-0.028
irds-0.028
ring-0.027
Token,
Token ID11
Feature activation+2.56e-02

Pos logit contributions
 Wally+0.007
 tuna+0.007
 Sebastian+0.006
 shrimp+0.006
 fisher+0.006
Neg logit contributions
appointed-0.009
chair-0.009
unct-0.009
irds-0.008
ring-0.008
Token "
Token ID366
Feature activation+9.37e-03

Pos logit contributions
 Wally+0.042
 tuna+0.042
 fisher+0.040
 shrimp+0.040
 Sebastian+0.039
Neg logit contributions
appointed-0.042
chair-0.038
ring-0.035
amas-0.035
unct-0.035
TokenIt
Token ID1026
Feature activation+2.39e-02

Pos logit contributions
 Wally+0.016
 tuna+0.016
 shrimp+0.015
 fisher+0.015
 Sebastian+0.015
Neg logit contributions
appointed-0.015
chair-0.014
irds-0.013
ring-0.013
unct-0.013
Token's
Token ID338
Feature activation+5.21e-02

Pos logit contributions
 Wally+0.056
 Sebastian+0.054
 fisher+0.053
 shrimp+0.053
 tuna+0.053
Neg logit contributions
appointed-0.020
chair-0.018
unct-0.015
irds-0.014
ring-0.014
Token a
Token ID257
Feature activation+7.37e-02

Pos logit contributions
 fisher+0.094
 Wally+0.090
 fisherman+0.087
 Sebastian+0.086
 Finn+0.084
Neg logit contributions
appointed-0.063
clock-0.056
 ceiling-0.050
 walls-0.050
ness-0.049
Token tuna
Token ID38883
Feature activation+9.31e-02

Pos logit contributions
 Wally+0.111
 Sebastian+0.107
 Finn+0.101
 Fin+0.098
 fisher+0.097
Neg logit contributions
 whole-0.095
 ceiling-0.092
appointed-0.091
 same-0.091
 biggest-0.090
Token,
Token ID11
Feature activation+7.62e-02

Pos logit contributions
 Sebastian+0.130
 Wally+0.121
 Fin+0.110
 Otto+0.105
Dave+0.104
Neg logit contributions
chair-0.133
 ceiling-0.130
ness-0.129
als-0.129
est-0.128
Token and
Token ID290
Feature activation+4.98e-02

Pos logit contributions
 fisher+0.112
 Wally+0.102
 Sebastian+0.098
inn+0.097
 fossil+0.097
Neg logit contributions
chair-0.104
 roof-0.099
 ceiling-0.099
irds-0.096
ring-0.096
Token we
Token ID356
Feature activation+6.61e-02

Pos logit contributions
 Wally+0.081
 Sebastian+0.079
 fisher+0.079
 Finn+0.072
 Otto+0.072
Neg logit contributions
chair-0.065
appointed-0.064
 ceiling-0.059
 roof-0.059
irds-0.058
Token can
Token ID460
Feature activation+6.36e-02

Pos logit contributions
 Wally+0.126
 tuna+0.117
 Sebastian+0.116
 fisher+0.115
 Otto+0.113
Neg logit contributions
appointed-0.073
ring-0.070
 roof-0.068
unct-0.065
 ceiling-0.063
Token catch
Token ID4929
Feature activation+3.72e-02

Pos logit contributions
 fossil+0.117
 Wally+0.116
 chefs+0.115
 tuna+0.114
 fisher+0.114
Neg logit contributions
appointed-0.064
rolling-0.057
 doorstep-0.055
 roof-0.055
 top-0.055
Token family
Token ID1641
Feature activation+1.87e-02

Pos logit contributions
appointed+0.061
irds+0.057
unct+0.054
chair+0.054
ring+0.053
Neg logit contributions
 tuna-0.053
 Wally-0.052
 shrimp-0.051
 Sebastian-0.046
 fisher-0.046
Token.
Token ID13
Feature activation+1.48e-02

Pos logit contributions
 Wally+0.031
 tuna+0.029
 fisher+0.028
 Sebastian+0.028
 shrimp+0.027
Neg logit contributions
chair-0.030
appointed-0.030
 halls-0.027
ring-0.027
clock-0.026
Token He
Token ID679
Feature activation+3.11e-02

Pos logit contributions
 Wally+0.025
 tuna+0.024
 fisher+0.023
 shrimp+0.023
 fossil+0.022
Neg logit contributions
appointed-0.025
chair-0.023
 halls-0.021
clock-0.021
amas-0.021
Token had
Token ID550
Feature activation-1.26e-03

Pos logit contributions
 Wally+0.055
 tuna+0.052
 fisher+0.051
 Sebastian+0.051
 shrimp+0.050
Neg logit contributions
chair-0.047
appointed-0.044
 halls-0.042
cases-0.040
unct-0.039
Token a
Token ID257
Feature activation+4.77e-02

Pos logit contributions
appointed+0.002
chair+0.002
unct+0.002
 halls+0.002
irds+0.002
Neg logit contributions
 Wally-0.002
 tuna-0.002
 shrimp-0.002
 Sebastian-0.002
 fisher-0.002
Token tuna
Token ID38883
Feature activation+9.30e-02

Pos logit contributions
 Wally+0.101
 Sebastian+0.095
 Finn+0.094
 fisher+0.094
 Otto+0.092
Neg logit contributions
 roof-0.048
chair-0.048
 ceiling-0.043
appointed-0.043
 stair-0.041
Token!
Token ID0
Feature activation+8.78e-03

Pos logit contributions
 Sebastian+0.139
 Wally+0.135
 fisher+0.126
 Fin+0.124
 Otto+0.123
Neg logit contributions
chair-0.124
als-0.115
 roof-0.113
 ceiling-0.113
ail-0.112
Token He
Token ID679
Feature activation+6.44e-02

Pos logit contributions
 Wally+0.015
 tuna+0.014
 shrimp+0.014
 fisher+0.014
 Sebastian+0.013
Neg logit contributions
appointed-0.015
chair-0.014
amas-0.012
irds-0.012
 halls-0.012
Token wanted
Token ID2227
Feature activation+5.08e-02

Pos logit contributions
 Wally+0.124
 fisher+0.121
 Sebastian+0.120
 tuna+0.112
 fossil+0.111
Neg logit contributions
chair-0.072
 halls-0.065
appointed-0.063
bell-0.058
cases-0.058
Token to
Token ID284
Feature activation+5.22e-02

Pos logit contributions
 fisher+0.104
 Sebastian+0.101
 Wally+0.100
 fisherman+0.098
 fossil+0.097
Neg logit contributions
clock-0.036
chair-0.034
 halls-0.034
filled-0.033
up-0.032
Token show
Token ID905
Feature activation+1.29e-02

Pos logit contributions
 fisher+0.103
 Wally+0.103
 fossil+0.101
 tuna+0.098
 Sebastian+0.098
Neg logit contributions
appointed-0.049
 doorstep-0.047
 roof-0.043
 venture-0.042
 wander-0.041
Token basket
Token ID7988
Feature activation-6.29e-03

Pos logit contributions
appointed+0.003
chair+0.003
unct+0.002
 halls+0.002
irds+0.002
Neg logit contributions
 Wally-0.003
 tuna-0.003
 Sebastian-0.003
 fisher-0.003
 shrimp-0.003
Token,
Token ID11
Feature activation+6.62e-03

Pos logit contributions
appointed+0.012
chair+0.012
unct+0.011
irds+0.011
 halls+0.011
Neg logit contributions
 Wally-0.009
 tuna-0.008
 shrimp-0.008
 Sebastian-0.008
 fisher-0.008
Token Jack
Token ID3619
Feature activation+2.67e-02

Pos logit contributions
 Wally+0.008
 tuna+0.008
 fisher+0.008
 Sebastian+0.008
 shrimp+0.007
Neg logit contributions
appointed-0.013
chair-0.013
 halls-0.012
unct-0.012
irds-0.012
Token found
Token ID1043
Feature activation+2.15e-02

Pos logit contributions
 Wally+0.041
 tuna+0.038
 fisher+0.038
 Sebastian+0.038
 shrimp+0.036
Neg logit contributions
chair-0.044
appointed-0.044
 halls-0.041
unct-0.040
irds-0.039
Token delicious
Token ID12625
Feature activation+3.33e-02

Pos logit contributions
 fisher+0.038
 Wally+0.038
 Sebastian+0.035
 tuna+0.035
 fossil+0.035
Neg logit contributions
chair-0.028
appointed-0.027
 halls-0.025
cases-0.024
 ceiling-0.024
Token sandwiches
Token ID29777
Feature activation+9.28e-02

Pos logit contributions
 Wally+0.062
 fisher+0.060
 Sebastian+0.059
 Finn+0.057
 Fin+0.057
Neg logit contributions
chair-0.041
appointed-0.038
 halls-0.038
bows-0.036
 walls-0.036
Token,
Token ID11
Feature activation+4.85e-02

Pos logit contributions
 fisher+0.180
 fossil+0.169
 Sebastian+0.169
 Crush+0.162
Fin+0.159
Neg logit contributions
chair-0.112
cakes-0.092
als-0.091
case-0.091
cases-0.089
Token apples
Token ID22514
Feature activation+7.03e-02

Pos logit contributions
 fisher+0.079
 Sebastian+0.076
 Wally+0.075
 fossil+0.073
 Finn+0.072
Neg logit contributions
chair-0.069
appointed-0.065
 walls-0.064
cases-0.064
 halls-0.063
Token,
Token ID11
Feature activation+4.44e-02

Pos logit contributions
 fisher+0.128
 fossil+0.120
 Sebastian+0.120
 Finn+0.117
 Wally+0.116
Neg logit contributions
chair-0.081
ics-0.078
 roof-0.075
als-0.074
cases-0.074
Token and
Token ID290
Feature activation+1.02e-02

Pos logit contributions
 Sebastian+0.062
 fisher+0.061
 Wally+0.061
 Finn+0.059
 fossil+0.058
Neg logit contributions
chair-0.073
appointed-0.068
cases-0.068
 walls-0.068
 halls-0.068
Token oranges
Token ID48389
Feature activation+4.82e-02

Pos logit contributions
 Wally+0.015
 Sebastian+0.015
 fisher+0.015
 tuna+0.014
 shrimp+0.014
Neg logit contributions
appointed-0.017
chair-0.017
 halls-0.016
unct-0.015
irds-0.015
Token Tim
Token ID5045
Feature activation-1.42e-02

Pos logit contributions
appointed+0.011
chair+0.011
unct+0.010
irds+0.010
 halls+0.010
Neg logit contributions
 Wally-0.008
 tuna-0.008
 shrimp-0.007
 Sebastian-0.007
 fisher-0.007
Token wanted
Token ID2227
Feature activation+1.08e-02

Pos logit contributions
appointed+0.021
chair+0.020
irds+0.018
unct+0.018
 halls+0.017
Neg logit contributions
 Wally-0.026
 tuna-0.025
 shrimp-0.024
 Sebastian-0.024
 fisher-0.024
Token to
Token ID284
Feature activation-1.36e-02

Pos logit contributions
 Wally+0.020
 fisher+0.019
 Sebastian+0.019
 tuna+0.019
 shrimp+0.019
Neg logit contributions
chair-0.014
appointed-0.014
 halls-0.012
unct-0.012
clock-0.012
Token find
Token ID1064
Feature activation+3.27e-02

Pos logit contributions
appointed+0.021
chair+0.020
unct+0.019
irds+0.018
 halls+0.018
Neg logit contributions
 Wally-0.024
 tuna-0.023
 shrimp-0.023
 Sebastian-0.022
 fisher-0.022
Token some
Token ID617
Feature activation+5.08e-02

Pos logit contributions
 Wally+0.056
 fisher+0.055
 Sebastian+0.052
 fossil+0.051
 shrimp+0.050
Neg logit contributions
chair-0.044
up-0.040
appointed-0.039
flash-0.038
 Lightning-0.038
Token food
Token ID2057
Feature activation+9.24e-02

Pos logit contributions
 Wally+0.092
 Sebastian+0.092
 fisher+0.089
 Crab+0.086
 Finn+0.085
Neg logit contributions
chair-0.066
 roof-0.057
 ceiling-0.056
 halls-0.055
appointed-0.055
Token,
Token ID11
Feature activation+4.06e-02

Pos logit contributions
 fisher+0.179
 Sebastian+0.171
 fossil+0.171
 Crush+0.160
 Wally+0.159
Neg logit contributions
chair-0.095
cases-0.078
case-0.077
 roof-0.074
ics-0.074
Token so
Token ID523
Feature activation-2.37e-02

Pos logit contributions
 fisher+0.071
 Wally+0.070
 fossil+0.069
 Sebastian+0.066
 fisherman+0.064
Neg logit contributions
chair-0.056
appointed-0.053
 roof-0.049
cases-0.049
 walls-0.048
Token he
Token ID339
Feature activation+6.66e-03

Pos logit contributions
appointed+0.060
chair+0.058
unct+0.057
amas+0.056
ring+0.056
Neg logit contributions
 Wally-0.018
 tuna-0.017
 Sebastian-0.016
 shrimp-0.016
 Fin-0.014
Token hopped
Token ID45230
Feature activation+3.99e-02

Pos logit contributions
 Wally+0.011
 tuna+0.010
 Sebastian+0.010
 fisher+0.010
 shrimp+0.010
Neg logit contributions
appointed-0.011
chair-0.011
unct-0.010
 halls-0.010
irds-0.010
Token around
Token ID1088
Feature activation+4.89e-02

Pos logit contributions
 fisher+0.081
 fossil+0.078
 tuna+0.078
 Sebastian+0.077
 Wally+0.077
Neg logit contributions
appointed-0.047
chair-0.037
filled-0.036
unct-0.035
 halls-0.034
Token the
Token ID262
Feature activation-2.77e-03

Pos logit contributions
appointed+0.017
chair+0.016
unct+0.015
irds+0.015
 halls+0.015
Neg logit contributions
 Wally-0.013
 tuna-0.012
 Sebastian-0.012
 shrimp-0.012
 fisher-0.011
Token sky
Token ID6766
Feature activation-5.45e-02

Pos logit contributions
appointed+0.004
chair+0.004
unct+0.003
irds+0.003
 halls+0.003
Neg logit contributions
 Wally-0.005
 tuna-0.005
 shrimp-0.005
 Sebastian-0.005
 fisher-0.005
Token fair
Token ID3148
Feature activation-3.77e-02

Pos logit contributions
appointed+0.116
chair+0.109
unct+0.109
 halls+0.105
irds+0.105
Neg logit contributions
 tuna-0.059
 Wally-0.058
 shrimp-0.056
 Sebastian-0.051
 like-0.050
Tokenies
Token ID444
Feature activation-1.41e-02

Pos logit contributions
appointed+0.086
unct+0.080
chair+0.080
flash+0.077
 halls+0.076
Neg logit contributions
 tuna-0.038
 Wally-0.038
 shrimp-0.037
 fisher-0.034
 Sebastian-0.033
Token spinning
Token ID19493
Feature activation-2.25e-02

Pos logit contributions
appointed+0.028
chair+0.027
unct+0.025
irds+0.025
 halls+0.025
Neg logit contributions
 Wally-0.019
 tuna-0.018
 shrimp-0.017
 Sebastian-0.017
 fisher-0.017
Token
Token ID220
Feature activation-1.35e-01

Pos logit contributions
appointed+0.044
chair+0.042
unct+0.040
irds+0.039
 halls+0.038
Neg logit contributions
 Wally-0.030
 tuna-0.029
 shrimp-0.028
 Sebastian-0.027
 fisher-0.026
Token their
Token ID511
Feature activation-3.70e-02

Pos logit contributions
unct+0.218
 forehead+0.194
 floors+0.193
 cheeks+0.193
 halls+0.193
Neg logit contributions
 tuna-0.171
 Wally-0.170
inn-0.168
 Finn-0.163
 Sebastian-0.155
Token magical
Token ID10883
Feature activation-1.50e-02

Pos logit contributions
appointed+0.061
chair+0.058
unct+0.053
rays+0.050
amas+0.050
Neg logit contributions
 Wally-0.061
 shrimp-0.061
 tuna-0.061
 fisher-0.056
 Sebastian-0.055
Token
Token ID220
Feature activation-1.33e-01

Pos logit contributions
appointed+0.022
chair+0.021
unct+0.020
irds+0.019
flash+0.018
Neg logit contributions
 Wally-0.027
 tuna-0.027
 shrimp-0.026
 Sebastian-0.025
 fisher-0.025
Token club
Token ID3430
Feature activation-5.95e-02

Pos logit contributions
unct+0.208
 halls+0.188
chair+0.187
appointed+0.181
 floors+0.181
Neg logit contributions
inn-0.170
 Wally-0.169
 tuna-0.165
 Finn-0.165
 Sebastian-0.160
Token."
Token ID22135
Feature activation-3.66e-02

Pos logit contributions
appointed+0.127
unct+0.120
chair+0.114
irds+0.113
flash+0.112
Neg logit contributions
 Wally-0.065
 tuna-0.063
 shrimp-0.059
 fisher-0.056
 fossil-0.053
Tokenies
Token ID444
Feature activation-1.41e-02

Pos logit contributions
appointed+0.086
unct+0.080
chair+0.080
flash+0.077
 halls+0.076
Neg logit contributions
 tuna-0.038
 Wally-0.038
 shrimp-0.037
 fisher-0.034
 Sebastian-0.033
Token spinning
Token ID19493
Feature activation-2.25e-02

Pos logit contributions
appointed+0.028
chair+0.027
unct+0.025
irds+0.025
 halls+0.025
Neg logit contributions
 Wally-0.019
 tuna-0.018
 shrimp-0.017
 Sebastian-0.017
 fisher-0.017
Token
Token ID220
Feature activation-1.35e-01

Pos logit contributions
appointed+0.044
chair+0.042
unct+0.040
irds+0.039
 halls+0.038
Neg logit contributions
 Wally-0.030
 tuna-0.029
 shrimp-0.028
 Sebastian-0.027
 fisher-0.026
Token their
Token ID511
Feature activation-3.70e-02

Pos logit contributions
unct+0.218
 forehead+0.194
 floors+0.193
 cheeks+0.193
 halls+0.193
Neg logit contributions
 tuna-0.171
 Wally-0.170
inn-0.168
 Finn-0.163
 Sebastian-0.155
Token magical
Token ID10883
Feature activation-1.50e-02

Pos logit contributions
appointed+0.061
chair+0.058
unct+0.053
rays+0.050
amas+0.050
Neg logit contributions
 Wally-0.061
 shrimp-0.061
 tuna-0.061
 fisher-0.056
 Sebastian-0.055
Token
Token ID220
Feature activation-1.33e-01

Pos logit contributions
appointed+0.022
chair+0.021
unct+0.020
irds+0.019
flash+0.018
Neg logit contributions
 Wally-0.027
 tuna-0.027
 shrimp-0.026
 Sebastian-0.025
 fisher-0.025
Token club
Token ID3430
Feature activation-5.95e-02

Pos logit contributions
unct+0.208
 halls+0.188
chair+0.187
appointed+0.181
 floors+0.181
Neg logit contributions
inn-0.170
 Wally-0.169
 tuna-0.165
 Finn-0.165
 Sebastian-0.160
Token."
Token ID22135
Feature activation-3.66e-02

Pos logit contributions
appointed+0.127
unct+0.120
chair+0.114
irds+0.113
flash+0.112
Neg logit contributions
 Wally-0.065
 tuna-0.063
 shrimp-0.059
 fisher-0.056
 fossil-0.053
Token
Token ID198
Feature activation-2.31e-02

Pos logit contributions
appointed+0.074
unct+0.073
chair+0.070
 halls+0.069
irds+0.068
Neg logit contributions
 Wally-0.044
 tuna-0.040
 Sebastian-0.038
 shrimp-0.035
 Fin-0.035
Token
Token ID198
Feature activation-2.29e-02

Pos logit contributions
appointed+0.054
chair+0.051
unct+0.051
ring+0.050
 halls+0.049
Neg logit contributions
 Wally-0.022
 tuna-0.020
 Sebastian-0.019
 shrimp-0.018
 Finn-0.017
TokenSarah
Token ID29284
Feature activation-2.44e-02

Pos logit contributions
appointed+0.038
unct+0.038
chair+0.038
ring+0.035
irds+0.035
Neg logit contributions
 Wally-0.034
 tuna-0.032
 Sebastian-0.031
 shrimp-0.031
 fisher-0.030
Token in
Token ID287
Feature activation+6.39e-03

Pos logit contributions
appointed+0.120
irds+0.109
unct+0.108
chair+0.107
EEP+0.106
Neg logit contributions
 tuna-0.059
 Wally-0.057
 shrimp-0.053
 fossil-0.049
 Sebastian-0.047
Token a
Token ID257
Feature activation-1.39e-02

Pos logit contributions
 Wally+0.011
 fisher+0.010
 tuna+0.010
 Sebastian+0.010
 shrimp+0.010
Neg logit contributions
appointed-0.010
chair-0.010
 halls-0.009
unct-0.009
irds-0.009
Token world
Token ID995
Feature activation-3.11e-02

Pos logit contributions
appointed+0.022
chair+0.021
irds+0.020
unct+0.020
ring+0.019
Neg logit contributions
 Wally-0.024
 tuna-0.023
 shrimp-0.022
 fisher-0.022
 Sebastian-0.022
Token of
Token ID286
Feature activation-4.08e-03

Pos logit contributions
appointed+0.068
unct+0.065
irds+0.063
amas+0.061
flash+0.060
Neg logit contributions
 Wally-0.034
 tuna-0.033
 shrimp-0.031
 Sebastian-0.029
 fisher-0.028
Token make
Token ID787
Feature activation-9.00e-02

Pos logit contributions
appointed+0.006
chair+0.006
unct+0.005
irds+0.005
 halls+0.005
Neg logit contributions
 Wally-0.007
 tuna-0.007
 shrimp-0.007
 Sebastian-0.007
 fisher-0.007
Token-
Token ID12
Feature activation-1.29e-01

Pos logit contributions
appointed+0.205
unct+0.182
EEP+0.180
bars+0.178
chair+0.178
Neg logit contributions
 tuna-0.111
 Wally-0.100
 shrimp-0.098
 fish-0.091
 his-0.087
Tokenbel
Token ID6667
Feature activation-4.65e-02

Pos logit contributions
amas+0.215
appointed+0.205
oga+0.192
EEP+0.186
 ceiling+0.175
Neg logit contributions
 who-0.195
 like-0.183
 fish-0.178
 shrimp-0.176
 named-0.168
Tokenieve
Token ID12311
Feature activation-2.27e-02

Pos logit contributions
appointed+0.087
chair+0.083
irds+0.082
EEP+0.078
 roof+0.077
Neg logit contributions
 tuna-0.057
 Wally-0.056
 fisher-0.055
 Sebastian-0.051
 fossil-0.051
Token.
Token ID13
Feature activation-5.87e-02

Pos logit contributions
appointed+0.043
chair+0.039
unct+0.039
irds+0.037
flash+0.036
Neg logit contributions
 Wally-0.033
 tuna-0.032
 shrimp-0.031
 fisher-0.030
 Sebastian-0.029
Token
Token ID198
Feature activation-1.41e-02

Pos logit contributions
unct+0.113
appointed+0.106
arse+0.104
ong+0.101
ring+0.100
Neg logit contributions
 Wally-0.086
 tuna-0.077
 named-0.072
 Nem-0.072
 Finn-0.071
Token
Token ID198
Feature activation-4.10e-02

Pos logit contributions
appointed+0.031
chair+0.029
unct+0.029
irds+0.028
ring+0.028
Neg logit contributions
 Wally-0.016
 tuna-0.015
 Sebastian-0.014
 shrimp-0.014
 fisher-0.013
Token.
Token ID13
Feature activation-2.44e-02

Pos logit contributions
appointed+0.013
chair+0.012
unct+0.012
irds+0.011
 halls+0.011
Neg logit contributions
 Wally-0.010
 tuna-0.010
 shrimp-0.009
 Sebastian-0.009
 fisher-0.009
Token It
Token ID632
Feature activation-4.59e-02

Pos logit contributions
appointed+0.042
chair+0.039
unct+0.039
 halls+0.037
irds+0.037
Neg logit contributions
 Wally-0.039
 tuna-0.038
 Sebastian-0.036
 shrimp-0.035
 fisher-0.035
Token will
Token ID481
Feature activation-1.91e-02

Pos logit contributions
appointed+0.093
irds+0.088
iece+0.086
unct+0.085
rolling+0.084
Neg logit contributions
 tuna-0.058
 Wally-0.055
 shrimp-0.048
 Finn-0.047
 Fin-0.047
Token make
Token ID787
Feature activation-6.95e-02

Pos logit contributions
appointed+0.032
chair+0.031
irds+0.029
unct+0.029
 halls+0.028
Neg logit contributions
 Wally-0.030
 tuna-0.030
 Sebastian-0.028
 shrimp-0.028
 fisher-0.027
Token your
Token ID534
Feature activation-6.20e-02

Pos logit contributions
appointed+0.163
irds+0.155
rolling+0.150
bows+0.149
chair+0.147
Neg logit contributions
 tuna-0.080
 Wally-0.068
 his-0.059
 shrimp-0.059
 Finn-0.058
Token ear
Token ID1027
Feature activation-1.29e-01

Pos logit contributions
appointed+0.108
irds+0.100
cakes+0.094
cases+0.093
filled+0.093
Neg logit contributions
 tuna-0.102
 Wally-0.087
 fish-0.086
 shrimp-0.086
 shark-0.084
Token better
Token ID1365
Feature activation-3.21e-02

Pos logit contributions
appointed+0.245
umm+0.235
ol+0.228
 halls+0.226
umbled+0.223
Neg logit contributions
 like-0.157
 who-0.143
 and-0.142
 also-0.133
 or-0.129
Token.
Token ID13
Feature activation-8.50e-03

Pos logit contributions
appointed+0.071
chair+0.066
unct+0.065
irds+0.064
flash+0.063
Neg logit contributions
 tuna-0.034
 Wally-0.032
 shrimp-0.031
 fisher-0.030
 Sebastian-0.028
Token Do
Token ID2141
Feature activation-3.77e-02

Pos logit contributions
appointed+0.014
chair+0.013
unct+0.012
irds+0.012
 halls+0.012
Neg logit contributions
 Wally-0.014
 tuna-0.014
 Sebastian-0.013
 shrimp-0.013
 fisher-0.013
Token you
Token ID345
Feature activation-2.18e-03

Pos logit contributions
appointed+0.077
chair+0.071
ring+0.070
rolling+0.069
amas+0.068
Neg logit contributions
 Wally-0.054
 tuna-0.051
 shrimp-0.046
 Fin-0.045
 Finn-0.044
Token want
Token ID765
Feature activation-1.95e-02

Pos logit contributions
appointed+0.004
chair+0.004
unct+0.003
irds+0.003
 halls+0.003
Neg logit contributions
 Wally-0.004
 tuna-0.003
 shrimp-0.003
 Sebastian-0.003
 fisher-0.003
Token horn
Token ID12718
Feature activation-4.30e-02

Pos logit contributions
appointed+0.101
irds+0.091
ics+0.088
rolling+0.087
cases+0.087
Neg logit contributions
 tuna-0.083
 fish-0.077
 fossil-0.075
 tank-0.075
 shark-0.071
Token and
Token ID290
Feature activation-8.57e-03

Pos logit contributions
appointed+0.121
unct+0.114
irds+0.111
 halls+0.110
rolling+0.109
Neg logit contributions
 tuna-0.021
 Wally-0.019
 fisher-0.015
 shrimp-0.014
 fish-0.013
Token made
Token ID925
Feature activation-7.77e-02

Pos logit contributions
appointed+0.017
chair+0.016
unct+0.015
irds+0.015
 halls+0.014
Neg logit contributions
 Wally-0.012
 tuna-0.011
 Sebastian-0.011
 shrimp-0.011
 fisher-0.011
Token v
Token ID410
Feature activation-5.91e-02

Pos logit contributions
appointed+0.201
unct+0.181
amas+0.177
rolling+0.176
chair+0.175
Neg logit contributions
 tuna-0.070
 his-0.067
 fish-0.066
 Finn-0.062
 shrimp-0.059
Tokenroom
Token ID3823
Feature activation-8.52e-02

Pos logit contributions
appointed+0.115
unct+0.104
chair+0.101
rated+0.094
 halls+0.094
Neg logit contributions
 fisher-0.073
 chefs-0.068
 tuna-0.066
 Finn-0.065
 Wally-0.064
Token-
Token ID12
Feature activation-1.26e-01

Pos logit contributions
appointed+0.149
unct+0.127
irds+0.124
 doorstep+0.123
oga+0.122
Neg logit contributions
 tuna-0.135
 fish-0.131
 his-0.121
 fisher-0.117
 and-0.116
Tokenv
Token ID85
Feature activation-3.64e-02

Pos logit contributions
appointed+0.257
 doorstep+0.250
amas+0.247
oga+0.228
rolling+0.225
Neg logit contributions
 who-0.131
 like-0.126
 fish-0.119
He-0.105
 his-0.100
Tokenroom
Token ID3823
Feature activation-8.56e-02

Pos logit contributions
appointed+0.060
unct+0.054
chair+0.053
 halls+0.048
rolling+0.048
Neg logit contributions
 tuna-0.058
 fisher-0.057
 Wally-0.056
 Sebastian-0.056
 Finn-0.055
Token noises
Token ID26782
Feature activation-3.98e-03

Pos logit contributions
appointed+0.144
unct+0.124
irds+0.120
oga+0.116
chair+0.116
Neg logit contributions
 tuna-0.140
 fish-0.136
 his-0.124
 and-0.123
 fisher-0.122
Token.
Token ID13
Feature activation-1.51e-02

Pos logit contributions
appointed+0.008
chair+0.008
unct+0.007
irds+0.007
 halls+0.007
Neg logit contributions
 Wally-0.005
 tuna-0.005
 shrimp-0.005
 Sebastian-0.005
 fisher-0.005
Token
Token ID198
Feature activation+1.19e-02

Pos logit contributions
appointed+0.030
chair+0.029
unct+0.028
irds+0.027
ring+0.027
Neg logit contributions
 Wally-0.020
 tuna-0.018
 Sebastian-0.018
 fisher-0.018
 shrimp-0.017
Token The
Token ID383
Feature activation-4.91e-02

Pos logit contributions
unct+0.085
chair+0.085
vation+0.083
iece+0.081
arse+0.080
Neg logit contributions
 Wally-0.075
 Fin-0.069
 shrimp-0.069
 Sebastian-0.069
 Finn-0.068
Token bright
Token ID6016
Feature activation-4.86e-02

Pos logit contributions
cases+0.082
chair+0.081
appointed+0.080
amas+0.076
unct+0.076
Neg logit contributions
 shrimp-0.082
 fish-0.078
 Wally-0.076
 tuna-0.075
 fisherman-0.070
Token colors
Token ID7577
Feature activation-1.35e-02

Pos logit contributions
appointed+0.080
cases+0.078
unct+0.076
seek+0.075
amas+0.075
Neg logit contributions
 shrimp-0.080
 fish-0.077
 tuna-0.074
 Wally-0.073
 named-0.069
Token made
Token ID925
Feature activation-5.57e-02

Pos logit contributions
appointed+0.025
chair+0.023
unct+0.022
irds+0.021
 halls+0.021
Neg logit contributions
 Wally-0.020
 tuna-0.019
 shrimp-0.019
 Sebastian-0.018
 fisher-0.018
Token Ab
Token ID2275
Feature activation-7.12e-02

Pos logit contributions
chair+0.132
appointed+0.125
bars+0.122
unct+0.120
cases+0.119
Neg logit contributions
 Wally-0.060
 shrimp-0.057
 tuna-0.054
 Nem-0.054
 Sebastian-0.053
Tokenig
Token ID328
Feature activation-1.24e-01

Pos logit contributions
appointed+0.150
chair+0.148
cases+0.138
 halls+0.136
case+0.135
Neg logit contributions
 Finn-0.070
 fisher-0.069
 tuna-0.068
 Sebastian-0.065
 shrimp-0.065
Tokenail
Token ID603
Feature activation-1.04e-01

Pos logit contributions
chair+0.224
clock+0.211
appointed+0.205
umm+0.198
 campus+0.188
Neg logit contributions
 and-0.163
 sw-0.153
 swimming-0.152
 fish-0.151
 spl-0.149
Token so
Token ID523
Feature activation-5.43e-02

Pos logit contributions
appointed+0.213
amas+0.210
aries+0.207
unct+0.204
chair+0.200
Neg logit contributions
 like-0.114
 fish-0.105
 tuna-0.105
 and-0.103
 shrimp-0.096
Token happy
Token ID3772
Feature activation-1.17e-02

Pos logit contributions
amas+0.136
aries+0.133
rays+0.132
appointed+0.130
 doorstep+0.129
Neg logit contributions
 Wally-0.045
 Finn-0.041
 shrimp-0.040
 tuna-0.038
 fish-0.038
Token.
Token ID13
Feature activation-3.91e-02

Pos logit contributions
appointed+0.025
chair+0.024
unct+0.023
irds+0.022
 halls+0.022
Neg logit contributions
 Wally-0.014
 tuna-0.014
 shrimp-0.013
 Sebastian-0.013
 fisher-0.012
Token
Token ID198
Feature activation+1.26e-03

Pos logit contributions
chair+0.063
unct+0.062
appointed+0.061
vation+0.057
ring+0.057
Neg logit contributions
 Wally-0.065
 Sebastian-0.061
 Finn-0.059
 Fin-0.059
 tuna-0.058
Token sun
Token ID4252
Feature activation-4.25e-02

Pos logit contributions
irds+0.065
appointed+0.062
owed+0.061
chair+0.060
unct+0.059
Neg logit contributions
 Sebastian-0.060
 tuna-0.059
 Wally-0.059
 shrimp-0.058
 fisher-0.057
Token would
Token ID561
Feature activation-2.12e-02

Pos logit contributions
appointed+0.092
unct+0.088
irds+0.086
 halls+0.083
chair+0.082
Neg logit contributions
 Wally-0.042
 tuna-0.040
 Sebastian-0.038
 shrimp-0.037
 named-0.037
Token make
Token ID787
Feature activation-1.04e-01

Pos logit contributions
appointed+0.031
chair+0.030
irds+0.028
unct+0.028
 halls+0.027
Neg logit contributions
 Wally-0.037
 tuna-0.036
 Sebastian-0.035
 shrimp-0.034
 fisher-0.034
Token the
Token ID262
Feature activation-4.97e-02

Pos logit contributions
appointed+0.284
irds+0.280
seek+0.273
cases+0.268
rolling+0.266
Neg logit contributions
 them-0.075
 their-0.069
 his-0.063
 it-0.050
 tuna-0.044
Token sc
Token ID629
Feature activation-9.09e-02

Pos logit contributions
irds+0.080
appointed+0.080
chair+0.077
owed+0.077
cases+0.076
Neg logit contributions
 shrimp-0.075
 tuna-0.075
 Sebastian-0.074
 fish-0.073
 Wally-0.070
Tokenooter
Token ID25141
Feature activation-1.24e-01

Pos logit contributions
 ceiling+0.121
appointed+0.119
amas+0.111
light+0.110
 endings+0.109
Neg logit contributions
 named-0.147
 fisher-0.138
 Sebastian-0.135
 fishing-0.132
 Fin-0.131
Token too
Token ID1165
Feature activation-4.10e-02

Pos logit contributions
irds+0.254
olk+0.244
appointed+0.244
umm+0.234
amas+0.233
Neg logit contributions
 like-0.150
 his-0.123
 and-0.119
 or-0.107
 too-0.107
Token hot
Token ID3024
Feature activation+8.41e-03

Pos logit contributions
appointed+0.084
irds+0.083
iece+0.082
unct+0.082
chair+0.081
Neg logit contributions
 tuna-0.044
 Wally-0.042
 shrimp-0.042
 Sebastian-0.039
 fisher-0.037
Token to
Token ID284
Feature activation-3.82e-02

Pos logit contributions
 Wally+0.013
 tuna+0.013
 fisher+0.012
 Sebastian+0.012
 shrimp+0.012
Neg logit contributions
appointed-0.014
chair-0.014
ring-0.013
unct-0.013
 halls-0.012
Token touch
Token ID3638
Feature activation-4.36e-02

Pos logit contributions
appointed+0.063
chair+0.059
irds+0.058
 halls+0.055
cases+0.055
Neg logit contributions
 Wally-0.061
 tuna-0.058
 Sebastian-0.058
 shrimp-0.055
 Otto-0.055
Token.
Token ID13
Feature activation-3.85e-02

Pos logit contributions
appointed+0.103
unct+0.096
irds+0.095
rolling+0.094
chair+0.092
Neg logit contributions
 Wally-0.043
 tuna-0.041
 shrimp-0.038
 Sebastian-0.036
 Finn-0.034
Token light
Token ID1657
Feature activation-1.82e-02

Pos logit contributions
unct+0.179
 cheeks+0.174
appointed+0.173
 forehead+0.164
erved+0.164
Neg logit contributions
 Wally-0.143
 Sebastian-0.139
 tuna-0.139
 shrimp-0.137
 fisher-0.136
Token
Token ID220
Feature activation-1.18e-01

Pos logit contributions
appointed+0.032
chair+0.029
unct+0.028
irds+0.027
 halls+0.027
Neg logit contributions
 Wally-0.029
 tuna-0.028
 shrimp-0.027
 Sebastian-0.026
 fisher-0.026
Token club
Token ID3430
Feature activation-1.62e-02

Pos logit contributions
unct+0.183
 cheeks+0.177
appointed+0.176
erved+0.169
 halls+0.168
Neg logit contributions
 tuna-0.146
 Wally-0.146
inn-0.143
 Sebastian-0.141
 shrimp-0.139
Token
Token ID220
Feature activation-1.21e-01

Pos logit contributions
appointed+0.032
chair+0.030
unct+0.029
irds+0.028
 halls+0.028
Neg logit contributions
 Wally-0.022
 tuna-0.021
 shrimp-0.020
 Sebastian-0.020
 fisher-0.019
Token spinning
Token ID19493
Feature activation-1.93e-02

Pos logit contributions
unct+0.187
 cheeks+0.182
appointed+0.178
 halls+0.176
 roof+0.175
Neg logit contributions
inn-0.152
 Wally-0.152
 tuna-0.150
 Sebastian-0.145
 Finn-0.142
Token
Token ID220
Feature activation-1.24e-01

Pos logit contributions
appointed+0.037
chair+0.035
unct+0.034
irds+0.033
 halls+0.033
Neg logit contributions
 Wally-0.026
 tuna-0.025
 shrimp-0.024
 Sebastian-0.024
 fisher-0.023
Token in
Token ID287
Feature activation-3.97e-03

Pos logit contributions
unct+0.195
 cheeks+0.184
 halls+0.183
appointed+0.183
 roof+0.181
Neg logit contributions
 Wally-0.157
 tuna-0.157
inn-0.156
 Finn-0.148
 Sebastian-0.147
Token the
Token ID262
Feature activation-3.84e-02

Pos logit contributions
appointed+0.008
chair+0.007
unct+0.007
irds+0.007
 halls+0.007
Neg logit contributions
 Wally-0.006
 tuna-0.005
 shrimp-0.005
 Sebastian-0.005
 fisher-0.005
Token sky
Token ID6766
Feature activation-4.68e-02

Pos logit contributions
appointed+0.050
unct+0.048
irds+0.047
chair+0.044
owed+0.043
Neg logit contributions
 tuna-0.077
 Wally-0.076
 shrimp-0.075
 fisher-0.069
 Sebastian-0.069
Token!"
Token ID2474
Feature activation-5.34e-02

Pos logit contributions
appointed+0.101
chair+0.098
unct+0.097
 halls+0.092
irds+0.092
Neg logit contributions
 Wally-0.050
 tuna-0.049
 shrimp-0.046
 fisher-0.043
 Sebastian-0.043
Token
Token ID198
Feature activation-2.56e-02

Pos logit contributions
unct+0.118
appointed+0.111
chair+0.108
flash+0.107
irds+0.107
Neg logit contributions
 named-0.050
 Wally-0.050
 Sebastian-0.047
 tuna-0.040
 shrimp-0.039
Token his
Token ID465
Feature activation-1.11e-02

Pos logit contributions
appointed+0.016
chair+0.015
unct+0.015
irds+0.014
 halls+0.014
Neg logit contributions
 Wally-0.006
 tuna-0.006
 shrimp-0.006
 Sebastian-0.006
 fisher-0.006
Token red
Token ID2266
Feature activation-1.68e-02

Pos logit contributions
appointed+0.018
chair+0.017
unct+0.015
irds+0.015
 halls+0.015
Neg logit contributions
 Wally-0.019
 tuna-0.019
 shrimp-0.018
 Sebastian-0.018
 fisher-0.018
Token cape
Token ID36023
Feature activation-1.65e-02

Pos logit contributions
appointed+0.029
chair+0.027
unct+0.025
irds+0.025
 halls+0.024
Neg logit contributions
 tuna-0.027
 Wally-0.026
 shrimp-0.025
 Sebastian-0.024
 fisher-0.024
Token and
Token ID290
Feature activation-2.60e-02

Pos logit contributions
appointed+0.037
chair+0.035
unct+0.034
irds+0.033
 halls+0.033
Neg logit contributions
 Wally-0.017
 tuna-0.017
 Sebastian-0.015
 shrimp-0.015
 fisher-0.015
Token make
Token ID787
Feature activation-6.53e-02

Pos logit contributions
appointed+0.052
chair+0.049
irds+0.047
unct+0.046
amas+0.046
Neg logit contributions
 tuna-0.034
 Wally-0.033
 shrimp-0.031
 Sebastian-0.031
 fisher-0.030
Token-
Token ID12
Feature activation-1.23e-01

Pos logit contributions
appointed+0.166
chair+0.150
irds+0.146
unct+0.142
ring+0.140
Neg logit contributions
 tuna-0.063
 his-0.056
 fish-0.053
 Wally-0.052
 Sebastian-0.052
Tokenbel
Token ID6667
Feature activation-4.70e-02

Pos logit contributions
appointed+0.207
amas+0.203
 doorstep+0.186
 ceiling+0.185
umm+0.182
Neg logit contributions
 who-0.170
 fish-0.165
 like-0.162
 named-0.158
 fisher-0.153
Tokenieve
Token ID12311
Feature activation-3.53e-03

Pos logit contributions
appointed+0.088
chair+0.084
irds+0.081
EEP+0.077
unct+0.076
Neg logit contributions
 tuna-0.061
 Wally-0.059
 fisher-0.058
 Sebastian-0.055
 shrimp-0.054
Token he
Token ID339
Feature activation-4.50e-02

Pos logit contributions
appointed+0.007
chair+0.007
unct+0.006
irds+0.006
 halls+0.006
Neg logit contributions
 Wally-0.005
 tuna-0.005
 shrimp-0.004
 Sebastian-0.004
 fisher-0.004
Token had
Token ID550
Feature activation-1.61e-02

Pos logit contributions
appointed+0.095
amas+0.086
irds+0.086
chair+0.085
unct+0.083
Neg logit contributions
 tuna-0.054
 Wally-0.049
 fisher-0.048
 fossil-0.048
 shrimp-0.046
Token super
Token ID2208
Feature activation-6.43e-02

Pos logit contributions
appointed+0.032
chair+0.030
unct+0.029
irds+0.028
amas+0.027
Neg logit contributions
 Wally-0.022
 tuna-0.021
 shrimp-0.020
 Sebastian-0.020
 fisher-0.020
Token day
Token ID1110
Feature activation-2.71e-02

Pos logit contributions
 Wally+0.024
 Sebastian+0.023
 tuna+0.022
 shrimp+0.022
 fisher+0.022
Neg logit contributions
chair-0.028
appointed-0.028
 halls-0.026
unct-0.025
bows-0.024
Token driving
Token ID5059
Feature activation-3.84e-02

Pos logit contributions
appointed+0.058
chair+0.054
unct+0.052
irds+0.052
flash+0.051
Neg logit contributions
 Wally-0.033
 tuna-0.032
 shrimp-0.030
 Sebastian-0.029
 fisher-0.028
Token and
Token ID290
Feature activation-4.33e-02

Pos logit contributions
appointed+0.076
unct+0.071
chair+0.069
irds+0.068
flash+0.067
Neg logit contributions
 Wally-0.051
 tuna-0.047
 Sebastian-0.046
 shrimp-0.045
 Finn-0.043
Token making
Token ID1642
Feature activation-6.61e-02

Pos logit contributions
appointed+0.086
irds+0.081
chair+0.080
amas+0.077
unct+0.076
Neg logit contributions
 Wally-0.057
 tuna-0.053
 Sebastian-0.051
 shrimp-0.048
 Finn-0.048
Token the
Token ID262
Feature activation-4.45e-02

Pos logit contributions
appointed+0.142
chair+0.128
irds+0.128
unct+0.126
ring+0.125
Neg logit contributions
 tuna-0.085
 Wally-0.083
 shrimp-0.077
 Sebastian-0.075
 Finn-0.074
Token world
Token ID995
Feature activation-1.23e-01

Pos logit contributions
appointed+0.071
chair+0.065
irds+0.062
ring+0.061
unct+0.061
Neg logit contributions
 tuna-0.078
 Wally-0.076
 shrimp-0.075
 fisher-0.072
 Sebastian-0.071
Token pretty
Token ID2495
Feature activation-4.44e-02

Pos logit contributions
unct+0.256
appointed+0.243
irds+0.235
amas+0.235
olk+0.231
Neg logit contributions
 like-0.149
 and-0.121
 their-0.111
 tuna-0.107
 his-0.107
Token.
Token ID13
Feature activation-1.44e-02

Pos logit contributions
appointed+0.099
unct+0.090
irds+0.086
rolling+0.085
chair+0.085
Neg logit contributions
 tuna-0.053
 shrimp-0.052
 Wally-0.048
 fish-0.046
 Sebastian-0.044
Token And
Token ID843
Feature activation+7.94e-03

Pos logit contributions
appointed+0.025
chair+0.024
unct+0.023
irds+0.022
 halls+0.022
Neg logit contributions
 Wally-0.022
 tuna-0.021
 Sebastian-0.020
 shrimp-0.020
 fisher-0.020
Token from
Token ID422
Feature activation+1.35e-02

Pos logit contributions
 Wally+0.011
 tuna+0.011
 fisher+0.011
 shrimp+0.011
 Sebastian+0.011
Neg logit contributions
appointed-0.014
chair-0.014
unct-0.013
irds-0.013
 halls-0.012
Token that
Token ID326
Feature activation-1.12e-02

Pos logit contributions
 Wally+0.018
 tuna+0.017
 fisher+0.017
 Sebastian+0.017
 shrimp+0.016
Neg logit contributions
appointed-0.026
chair-0.026
unct-0.024
 halls-0.023
filled-0.022
Token the
Token ID262
Feature activation-5.54e-02

Pos logit contributions
appointed+0.054
chair+0.052
unct+0.049
 halls+0.048
cases+0.048
Neg logit contributions
 Wally-0.034
 tuna-0.032
 shrimp-0.030
 Sebastian-0.030
 Finn-0.029
Token little
Token ID1310
Feature activation-5.01e-02

Pos logit contributions
irds+0.109
appointed+0.109
ics+0.107
ring+0.104
unct+0.103
Neg logit contributions
 tuna-0.072
 shrimp-0.059
 Wally-0.058
 also-0.056
 named-0.055
Token girl
Token ID2576
Feature activation-9.02e-03

Pos logit contributions
appointed+0.097
cases+0.092
ics+0.090
unct+0.089
ong+0.089
Neg logit contributions
 tuna-0.072
 shrimp-0.069
 Wally-0.064
 fish-0.061
 fisher-0.059
Token and
Token ID290
Feature activation-3.50e-02

Pos logit contributions
appointed+0.017
chair+0.017
unct+0.016
irds+0.016
 halls+0.015
Neg logit contributions
 Wally-0.012
 tuna-0.012
 shrimp-0.011
 Sebastian-0.011
 fisher-0.011
Token her
Token ID607
Feature activation-4.79e-02

Pos logit contributions
appointed+0.070
cases+0.065
irds+0.064
amas+0.063
rolling+0.063
Neg logit contributions
 Wally-0.048
 tuna-0.045
 Sebastian-0.042
 shrimp-0.042
 named-0.040
Token arm
Token ID3211
Feature activation-1.22e-01

Pos logit contributions
appointed+0.110
unct+0.100
irds+0.099
iece+0.098
ring+0.097
Neg logit contributions
 tuna-0.053
 Wally-0.046
 shrimp-0.042
 named-0.041
 Otto-0.040
Tokenchair
Token ID16337
Feature activation-6.05e-02

Pos logit contributions
appointed+0.189
unct+0.184
vation+0.168
arse+0.163
irds+0.159
Neg logit contributions
 named-0.185
 and-0.176
 tuna-0.155
 who-0.153
 his-0.151
Token were
Token ID547
Feature activation-1.74e-02

Pos logit contributions
appointed+0.128
amas+0.115
unct+0.115
irds+0.110
chair+0.109
Neg logit contributions
 Wally-0.071
 named-0.069
 tuna-0.067
 Finn-0.063
 Fin-0.061
Token the
Token ID262
Feature activation-3.20e-02

Pos logit contributions
appointed+0.033
chair+0.033
unct+0.031
irds+0.030
 halls+0.030
Neg logit contributions
 Wally-0.024
 tuna-0.023
 shrimp-0.022
 Sebastian-0.022
 fisher-0.021
Token best
Token ID1266
Feature activation-4.15e-02

Pos logit contributions
chair+0.057
appointed+0.057
irds+0.055
 halls+0.054
unct+0.053
Neg logit contributions
 Wally-0.046
 tuna-0.046
 fisher-0.044
 shrimp-0.042
 Sebastian-0.041
Token of
Token ID286
Feature activation+7.64e-03

Pos logit contributions
appointed+0.100
unct+0.093
iece+0.086
irds+0.085
flash+0.085
Neg logit contributions
 tuna-0.043
 Wally-0.039
 Finn-0.035
 shrimp-0.035
 Otto-0.034
Token shining
Token ID22751
Feature activation-4.89e-02

Pos logit contributions
chair+0.058
appointed+0.058
irds+0.058
unct+0.056
raid+0.052
Neg logit contributions
 tuna-0.048
 Wally-0.047
 shrimp-0.045
 Sebastian-0.045
 named-0.040
Token brightly
Token ID35254
Feature activation-1.90e-02

Pos logit contributions
appointed+0.120
irds+0.119
unct+0.119
chair+0.115
seek+0.112
Neg logit contributions
 tuna-0.036
 shrimp-0.033
 Wally-0.032
 Sebastian-0.029
 like-0.028
Token,
Token ID11
Feature activation-2.20e-02

Pos logit contributions
appointed+0.043
chair+0.041
irds+0.040
unct+0.040
 halls+0.039
Neg logit contributions
 Wally-0.019
 tuna-0.019
 shrimp-0.018
 Sebastian-0.017
 fisher-0.016
Token making
Token ID1642
Feature activation-9.54e-02

Pos logit contributions
appointed+0.054
chair+0.052
unct+0.051
irds+0.051
flash+0.048
Neg logit contributions
 Wally-0.018
 tuna-0.017
 shrimp-0.016
 Sebastian-0.015
 fisher-0.013
Token the
Token ID262
Feature activation-6.86e-02

Pos logit contributions
appointed+0.248
unct+0.240
irds+0.232
ooter+0.231
bars+0.231
Neg logit contributions
 tuna-0.079
 shrimp-0.069
 their-0.068
 them-0.064
 Wally-0.056
Token world
Token ID995
Feature activation-1.22e-01

Pos logit contributions
appointed+0.116
irds+0.114
unct+0.104
ending+0.102
ring+0.102
Neg logit contributions
 tuna-0.112
 shrimp-0.108
 fish-0.102
 Wally-0.097
 shark-0.095
Token look
Token ID804
Feature activation-2.72e-02

Pos logit contributions
unct+0.279
appointed+0.265
irds+0.250
amas+0.246
EEP+0.245
Neg logit contributions
 like-0.144
 and-0.123
 tuna-0.105
 their-0.102
 fish-0.097
Token so
Token ID523
Feature activation-7.63e-02

Pos logit contributions
appointed+0.049
chair+0.048
irds+0.046
unct+0.045
oga+0.043
Neg logit contributions
 tuna-0.039
 shrimp-0.039
 Wally-0.038
 Sebastian-0.036
 fisher-0.034
Token beautiful
Token ID4950
Feature activation-6.81e-03

Pos logit contributions
amas+0.176
age+0.174
iece+0.173
oga+0.172
seek+0.172
Neg logit contributions
 shrimp-0.083
 Nem-0.070
 tuna-0.070
 inviting-0.067
 fish-0.063
Token.
Token ID13
Feature activation-7.43e-02

Pos logit contributions
appointed+0.013
chair+0.013
unct+0.012
irds+0.012
 halls+0.012
Neg logit contributions
 Wally-0.009
 tuna-0.009
 shrimp-0.008
 Sebastian-0.008
 fisher-0.008
Token She
Token ID1375
Feature activation-3.27e-02

Pos logit contributions
unct+0.165
vation+0.148
ring+0.146
irds+0.145
ong+0.141
Neg logit contributions
 named-0.073
 Wally-0.072
 Sebastian-0.069
 but-0.062
 like-0.061
Token from
Token ID422
Feature activation-3.91e-02

Pos logit contributions
 Wally+0.025
 tuna+0.024
 fisher+0.023
 Sebastian+0.023
 shrimp+0.023
Neg logit contributions
appointed-0.028
chair-0.026
unct-0.025
 halls-0.024
clock-0.023
Token home
Token ID1363
Feature activation-3.55e-02

Pos logit contributions
appointed+0.083
irds+0.080
amas+0.079
EEP+0.078
ring+0.077
Neg logit contributions
 Wally-0.048
 tuna-0.046
 shrimp-0.043
 Sebastian-0.040
 Finn-0.040
Token and
Token ID290
Feature activation-5.54e-02

Pos logit contributions
appointed+0.091
chair+0.087
unct+0.084
irds+0.084
flash+0.083
Neg logit contributions
 Wally-0.023
 tuna-0.022
 shrimp-0.020
 Sebastian-0.020
 fisher-0.019
Token eventually
Token ID4191
Feature activation-4.59e-02

Pos logit contributions
amas+0.132
irds+0.128
appointed+0.126
oga+0.125
akes+0.125
Neg logit contributions
 named-0.048
 Wally-0.045
 Finn-0.038
 Sebastian-0.037
 tuna-0.036
Token stumbled
Token ID24241
Feature activation-3.15e-02

Pos logit contributions
appointed+0.102
chair+0.100
amas+0.096
clock+0.096
bars+0.094
Neg logit contributions
 Wally-0.050
 shrimp-0.043
 Sebastian-0.043
 named-0.041
 tuna-0.040
Token upon
Token ID2402
Feature activation-1.21e-01

Pos logit contributions
appointed+0.057
chair+0.056
unct+0.052
irds+0.051
 halls+0.050
Neg logit contributions
 Wally-0.045
 tuna-0.042
 Sebastian-0.041
 shrimp-0.040
 Finn-0.040
Token a
Token ID257
Feature activation+4.81e-03

Pos logit contributions
irds+0.376
raid+0.365
rolling+0.360
anna+0.359
ring+0.359
Neg logit contributions
 tuna-0.024
 his-0.018
 Wally-0.016
 like-0.013
 also-0.006
Token weird
Token ID7650
Feature activation+1.58e-02

Pos logit contributions
 Wally+0.009
 tuna+0.009
 Sebastian+0.008
 fisher+0.008
 shrimp+0.008
Neg logit contributions
appointed-0.007
chair-0.006
unct-0.006
 halls-0.006
irds-0.006
Token stage
Token ID3800
Feature activation-1.49e-02

Pos logit contributions
 Wally+0.028
 Sebastian+0.026
 tuna+0.025
 fisher+0.025
 Finn+0.025
Neg logit contributions
chair-0.023
appointed-0.023
unct-0.021
 roof-0.020
 halls-0.020
Token.
Token ID13
Feature activation-7.12e-02

Pos logit contributions
appointed+0.030
chair+0.029
unct+0.028
irds+0.027
flash+0.027
Neg logit contributions
 Wally-0.019
 tuna-0.018
 shrimp-0.017
 Sebastian-0.017
 fisher-0.017
Token On
Token ID1550
Feature activation-3.21e-02

Pos logit contributions
unct+0.149
iece+0.142
arse+0.138
irds+0.137
vation+0.135
Neg logit contributions
 Wally-0.092
 tuna-0.086
 Otto-0.078
 shrimp-0.076
 named-0.075
Token when
Token ID618
Feature activation+1.23e-02

Pos logit contributions
 Wally+0.044
 fisher+0.042
 tuna+0.042
 Sebastian+0.042
 shrimp+0.042
Neg logit contributions
chair-0.039
appointed-0.035
clock-0.033
irds-0.033
ring-0.032
Token he
Token ID339
Feature activation+6.21e-03

Pos logit contributions
 Wally+0.015
 tuna+0.014
 fisher+0.013
 shrimp+0.013
 Sebastian+0.013
Neg logit contributions
appointed-0.025
chair-0.024
unct-0.023
 halls-0.022
irds-0.022
Token was
Token ID373
Feature activation-2.71e-02

Pos logit contributions
 Wally+0.010
 tuna+0.009
 Sebastian+0.009
 shrimp+0.009
 fisher+0.009
Neg logit contributions
appointed-0.010
chair-0.010
unct-0.009
 halls-0.009
irds-0.009
Token playing
Token ID2712
Feature activation+2.68e-02

Pos logit contributions
chair+0.051
appointed+0.051
irds+0.048
unct+0.046
amas+0.046
Neg logit contributions
 Wally-0.039
 tuna-0.036
 Sebastian-0.035
 shrimp-0.035
 fisher-0.035
Token make
Token ID787
Feature activation-3.65e-02

Pos logit contributions
 Wally+0.037
 Sebastian+0.036
 tuna+0.035
 fisher+0.035
 shrimp+0.035
Neg logit contributions
chair-0.049
appointed-0.046
 halls-0.043
iece-0.042
amas-0.042
Token-
Token ID12
Feature activation-1.21e-01

Pos logit contributions
appointed+0.077
unct+0.070
chair+0.069
irds+0.067
flash+0.065
Neg logit contributions
 tuna-0.049
 Wally-0.049
 shrimp-0.045
 Sebastian-0.044
 Finn-0.042
Tokenbel
Token ID6667
Feature activation-4.75e-02

Pos logit contributions
amas+0.195
appointed+0.195
 ceiling+0.179
 doorstep+0.178
 spine+0.174
Neg logit contributions
 who-0.179
 named-0.169
 shrimp-0.169
 fish-0.160
 like-0.160
Tokenieve
Token ID12311
Feature activation+3.77e-03

Pos logit contributions
appointed+0.088
irds+0.084
chair+0.082
 roof+0.080
unct+0.079
Neg logit contributions
 tuna-0.060
 Wally-0.059
 fisher-0.058
 Sebastian-0.054
 shrimp-0.053
Token at
Token ID379
Feature activation-1.04e-02

Pos logit contributions
 Wally+0.005
 tuna+0.005
 Sebastian+0.005
 shrimp+0.005
 fisher+0.005
Neg logit contributions
chair-0.007
appointed-0.007
irds-0.006
unct-0.006
 halls-0.006
Token home
Token ID1363
Feature activation-2.08e-02

Pos logit contributions
appointed+0.021
chair+0.020
unct+0.019
irds+0.018
 halls+0.018
Neg logit contributions
 Wally-0.014
 tuna-0.013
 shrimp-0.013
 Sebastian-0.013
 fisher-0.012
Token,
Token ID11
Feature activation+3.09e-02

Pos logit contributions
appointed+0.044
chair+0.042
unct+0.040
irds+0.040
flash+0.039
Neg logit contributions
 Wally-0.025
 tuna-0.024
 shrimp-0.023
 Sebastian-0.023
 fisher-0.022
Token a
Token ID257
Feature activation+4.09e-03

Pos logit contributions
appointed+0.107
chair+0.097
irds+0.095
unct+0.094
amas+0.093
Neg logit contributions
 tuna-0.046
 shrimp-0.045
 Wally-0.044
 Sebastian-0.041
 fish-0.039
Token
Token ID220
Feature activation-1.13e-01

Pos logit contributions
 Wally+0.008
 tuna+0.007
 Sebastian+0.007
 shrimp+0.007
 fisher+0.007
Neg logit contributions
appointed-0.006
chair-0.006
unct-0.005
irds-0.005
 halls-0.005
Token light
Token ID1657
Feature activation-1.82e-02

Pos logit contributions
unct+0.179
 cheeks+0.174
appointed+0.173
 forehead+0.164
erved+0.164
Neg logit contributions
 Wally-0.143
 Sebastian-0.139
 tuna-0.139
 shrimp-0.137
 fisher-0.136
Token
Token ID220
Feature activation-1.18e-01

Pos logit contributions
appointed+0.032
chair+0.029
unct+0.028
irds+0.027
 halls+0.027
Neg logit contributions
 Wally-0.029
 tuna-0.028
 shrimp-0.027
 Sebastian-0.026
 fisher-0.026
Token club
Token ID3430
Feature activation-1.62e-02

Pos logit contributions
unct+0.183
 cheeks+0.177
appointed+0.176
erved+0.169
 halls+0.168
Neg logit contributions
 tuna-0.146
 Wally-0.146
inn-0.143
 Sebastian-0.141
 shrimp-0.139
Token
Token ID220
Feature activation-1.21e-01

Pos logit contributions
appointed+0.032
chair+0.030
unct+0.029
irds+0.028
 halls+0.028
Neg logit contributions
 Wally-0.022
 tuna-0.021
 shrimp-0.020
 Sebastian-0.020
 fisher-0.019
Token spinning
Token ID19493
Feature activation-1.93e-02

Pos logit contributions
unct+0.187
 cheeks+0.182
appointed+0.178
 halls+0.176
 roof+0.175
Neg logit contributions
inn-0.152
 Wally-0.152
 tuna-0.150
 Sebastian-0.145
 Finn-0.142
Token
Token ID220
Feature activation-1.24e-01

Pos logit contributions
appointed+0.037
chair+0.035
unct+0.034
irds+0.033
 halls+0.033
Neg logit contributions
 Wally-0.026
 tuna-0.025
 shrimp-0.024
 Sebastian-0.024
 fisher-0.023
Token in
Token ID287
Feature activation-3.97e-03

Pos logit contributions
unct+0.195
 cheeks+0.184
 halls+0.183
appointed+0.183
 roof+0.181
Neg logit contributions
 Wally-0.157
 tuna-0.157
inn-0.156
 Finn-0.148
 Sebastian-0.147
Token the
Token ID262
Feature activation-3.84e-02

Pos logit contributions
appointed+0.008
chair+0.007
unct+0.007
irds+0.007
 halls+0.007
Neg logit contributions
 Wally-0.006
 tuna-0.005
 shrimp-0.005
 Sebastian-0.005
 fisher-0.005
Token sky
Token ID6766
Feature activation-4.68e-02

Pos logit contributions
appointed+0.050
unct+0.048
irds+0.047
chair+0.044
owed+0.043
Neg logit contributions
 tuna-0.077
 Wally-0.076
 shrimp-0.075
 fisher-0.069
 Sebastian-0.069
Token blanket
Token ID18447
Feature activation-1.88e-02

Pos logit contributions
appointed+0.001
chair+0.001
unct+0.001
irds+0.001
 halls+0.001
Neg logit contributions
 Wally-0.001
 tuna-0.001
 Sebastian-0.001
 shrimp-0.001
 fisher-0.001
Token before
Token ID878
Feature activation-2.22e-02

Pos logit contributions
appointed+0.038
chair+0.036
unct+0.035
irds+0.034
 halls+0.034
Neg logit contributions
 Wally-0.024
 tuna-0.023
 shrimp-0.022
 Sebastian-0.022
 fisher-0.021
Token it
Token ID340
Feature activation-5.74e-02

Pos logit contributions
appointed+0.043
chair+0.041
irds+0.040
unct+0.040
flash+0.038
Neg logit contributions
 Wally-0.030
 tuna-0.030
 shrimp-0.028
 Sebastian-0.027
 fisher-0.025
Token was
Token ID373
Feature activation-1.58e-03

Pos logit contributions
appointed+0.114
irds+0.110
unct+0.108
chair+0.104
iece+0.102
Neg logit contributions
 Wally-0.069
 tuna-0.069
 named-0.060
 shrimp-0.060
 Finn-0.060
Token torn
Token ID12445
Feature activation-1.80e-02

Pos logit contributions
appointed+0.003
chair+0.003
unct+0.002
irds+0.002
 halls+0.002
Neg logit contributions
 Wally-0.003
 tuna-0.002
 shrimp-0.002
 Sebastian-0.002
 fisher-0.002
Token
Token ID220
Feature activation-1.20e-01

Pos logit contributions
appointed+0.037
chair+0.035
unct+0.034
irds+0.034
 halls+0.033
Neg logit contributions
 Wally-0.022
 tuna-0.021
 Sebastian-0.020
 shrimp-0.020
 fisher-0.019
Token
Token ID198
Feature activation-2.37e-02

Pos logit contributions
 halls+0.177
unct+0.176
chair+0.172
 floors+0.165
 streets+0.158
Neg logit contributions
inn-0.167
 Wally-0.161
 tuna-0.160
 Sebastian-0.152
 shrimp-0.143
Token
Token ID198
Feature activation-8.86e-03

Pos logit contributions
appointed+0.053
unct+0.051
chair+0.050
 halls+0.049
irds+0.049
Neg logit contributions
 Wally-0.025
 tuna-0.023
 Sebastian-0.022
 shrimp-0.021
 Finn-0.020
TokenS
Token ID50
Feature activation-3.97e-02

Pos logit contributions
appointed+0.014
chair+0.013
unct+0.012
irds+0.012
 halls+0.012
Neg logit contributions
 Wally-0.015
 tuna-0.015
 Sebastian-0.014
 shrimp-0.014
 fisher-0.014
Tokenue
Token ID518
Feature activation-3.25e-02

Pos logit contributions
chair+0.074
appointed+0.073
 halls+0.069
 ceiling+0.066
 roof+0.065
Neg logit contributions
 Wally-0.056
 tuna-0.053
 shrimp-0.053
 Sebastian-0.051
 fisher-0.049
Token decided
Token ID3066
Feature activation-1.16e-02

Pos logit contributions
appointed+0.062
chair+0.058
irds+0.057
amas+0.056
 halls+0.056
Neg logit contributions
 Wally-0.043
 tuna-0.042
 shrimp-0.039
 Sebastian-0.039
 named-0.038
Token way
Token ID835
Feature activation-4.44e-02

Pos logit contributions
appointed+0.010
chair+0.010
unct+0.009
irds+0.009
 halls+0.009
Neg logit contributions
 Wally-0.012
 tuna-0.012
 shrimp-0.011
 Sebastian-0.011
 fisher-0.011
Token.
Token ID13
Feature activation-3.42e-02

Pos logit contributions
appointed+0.106
unct+0.101
chair+0.099
amas+0.096
flash+0.096
Neg logit contributions
 Wally-0.035
 Sebastian-0.031
 tuna-0.031
 shrimp-0.030
 named-0.029
Token This
Token ID770
Feature activation-7.25e-03

Pos logit contributions
unct+0.067
appointed+0.065
irds+0.063
chair+0.061
ring+0.061
Neg logit contributions
 Wally-0.046
 tuna-0.045
 shrimp-0.041
 Finn-0.040
 Sebastian-0.039
Token made
Token ID925
Feature activation-6.58e-02

Pos logit contributions
appointed+0.014
chair+0.013
unct+0.012
irds+0.012
 halls+0.012
Neg logit contributions
 Wally-0.010
 tuna-0.010
 shrimp-0.010
 Sebastian-0.010
 fisher-0.009
Token the
Token ID262
Feature activation-4.93e-02

Pos logit contributions
appointed+0.186
unct+0.179
chair+0.177
irds+0.175
age+0.174
Neg logit contributions
 tuna-0.045
 Wally-0.040
 Finn-0.033
 shrimp-0.029
 Nem-0.024
Token dress
Token ID6576
Feature activation-1.20e-01

Pos logit contributions
appointed+0.092
irds+0.089
cases+0.083
chair+0.083
rolling+0.083
Neg logit contributions
 tuna-0.067
 shrimp-0.066
 Wally-0.058
 named-0.057
 like-0.055
Token look
Token ID804
Feature activation-2.66e-02

Pos logit contributions
appointed+0.260
unct+0.260
anna+0.246
irds+0.245
chair+0.241
Neg logit contributions
 like-0.131
 and-0.115
 but-0.091
 just-0.090
 also-0.088
Token different
Token ID1180
Feature activation-1.68e-02

Pos logit contributions
appointed+0.050
chair+0.049
irds+0.047
unct+0.047
amas+0.044
Neg logit contributions
 tuna-0.036
 Wally-0.035
 shrimp-0.034
 Sebastian-0.033
 fisher-0.031
Token,
Token ID11
Feature activation+1.60e-02

Pos logit contributions
appointed+0.038
chair+0.037
unct+0.035
irds+0.035
 halls+0.034
Neg logit contributions
 Wally-0.017
 tuna-0.017
 shrimp-0.016
 Sebastian-0.015
 fisher-0.015
Token with
Token ID351
Feature activation-1.52e-02

Pos logit contributions
 Wally+0.020
 fisher+0.019
 Sebastian+0.018
 tuna+0.018
 shrimp+0.017
Neg logit contributions
appointed-0.033
chair-0.032
 halls-0.029
bows-0.029
 roof-0.029
Token a
Token ID257
Feature activation-1.32e-02

Pos logit contributions
appointed+0.026
chair+0.025
unct+0.023
irds+0.023
amas+0.022
Neg logit contributions
 Wally-0.024
 tuna-0.024
 shrimp-0.023
 Sebastian-0.022
 fisher-0.022
Token louder
Token ID27089
Feature activation-2.69e-02

Pos logit contributions
appointed+0.023
chair+0.023
irds+0.021
unct+0.021
amas+0.021
Neg logit contributions
 Wally-0.023
 tuna-0.023
 shrimp-0.022
 Sebastian-0.021
 fisher-0.021
Token.
Token ID13
Feature activation-2.01e-02

Pos logit contributions
appointed+0.061
chair+0.058
unct+0.056
irds+0.055
amas+0.054
Neg logit contributions
 Wally-0.027
 tuna-0.027
 fisher-0.025
 Sebastian-0.025
 shrimp-0.025
Token Suddenly
Token ID24975
Feature activation-1.53e-02

Pos logit contributions
appointed+0.037
unct+0.036
chair+0.035
ring+0.034
irds+0.034
Neg logit contributions
 Wally-0.030
 tuna-0.029
 Sebastian-0.027
 Finn-0.027
 Fin-0.027
Token,
Token ID11
Feature activation-1.77e-02

Pos logit contributions
appointed+0.034
chair+0.033
unct+0.031
irds+0.031
 halls+0.030
Neg logit contributions
 Wally-0.017
 tuna-0.016
 shrimp-0.015
 Sebastian-0.015
 fisher-0.015
Token the
Token ID262
Feature activation-8.78e-02

Pos logit contributions
appointed+0.039
chair+0.037
irds+0.035
unct+0.035
ring+0.035
Neg logit contributions
 Wally-0.021
 tuna-0.020
 shrimp-0.019
 Sebastian-0.019
 fisher-0.018
Token ringing
Token ID32333
Feature activation-1.20e-01

Pos logit contributions
ics+0.196
irds+0.193
ance+0.191
owed+0.186
bows+0.184
Neg logit contributions
 also-0.075
 and-0.073
 tuna-0.073
 who-0.072
 named-0.067
Token stopped
Token ID5025
Feature activation-1.18e-02

Pos logit contributions
unct+0.238
appointed+0.228
rolling+0.216
irds+0.213
amas+0.211
Neg logit contributions
 named-0.125
 and-0.123
 his-0.114
 made-0.113
 who-0.109
Token and
Token ID290
Feature activation-1.95e-02

Pos logit contributions
appointed+0.026
chair+0.026
irds+0.024
unct+0.024
 halls+0.024
Neg logit contributions
 Wally-0.013
 tuna-0.012
 shrimp-0.011
 Sebastian-0.011
 fisher-0.011
Token a
Token ID257
Feature activation-1.83e-02

Pos logit contributions
appointed+0.043
chair+0.041
irds+0.039
amas+0.038
 halls+0.038
Neg logit contributions
 Wally-0.023
 tuna-0.022
 shrimp-0.021
 Sebastian-0.021
 Finn-0.020
Token funny
Token ID8258
Feature activation-5.53e-02

Pos logit contributions
appointed+0.028
chair+0.026
unct+0.025
irds+0.024
ring+0.024
Neg logit contributions
 Wally-0.033
 tuna-0.032
 shrimp-0.031
 fisher-0.031
 Sebastian-0.030
Token voice
Token ID3809
Feature activation-9.45e-03

Pos logit contributions
appointed+0.099
cases+0.087
unct+0.086
irds+0.085
bars+0.085
Neg logit contributions
 tuna-0.091
 shrimp-0.086
 Wally-0.084
 shark-0.082
 fisher-0.081
Token song
Token ID3496
Feature activation-5.73e-02

Pos logit contributions
appointed+0.142
unct+0.135
bars+0.128
rolling+0.127
cases+0.127
Neg logit contributions
 fish-0.115
 tuna-0.112
 shrimp-0.111
 and-0.110
 named-0.102
Token and
Token ID290
Feature activation-3.38e-02

Pos logit contributions
appointed+0.145
unct+0.140
 halls+0.128
amas+0.127
rolling+0.126
Neg logit contributions
 Wally-0.047
 tuna-0.045
 shrimp-0.038
 Finn-0.035
 like-0.034
Token make
Token ID787
Feature activation-9.60e-02

Pos logit contributions
appointed+0.058
chair+0.054
amas+0.053
unct+0.051
cases+0.050
Neg logit contributions
 Wally-0.056
 tuna-0.053
 shrimp-0.052
 fisher-0.049
 Sebastian-0.049
Token the
Token ID262
Feature activation-6.64e-02

Pos logit contributions
appointed+0.244
bars+0.224
chair+0.222
unct+0.215
cases+0.214
Neg logit contributions
 tuna-0.096
 fish-0.089
 Wally-0.086
 shrimp-0.086
 their-0.083
Token toy
Token ID13373
Feature activation-8.86e-02

Pos logit contributions
cases+0.132
appointed+0.127
bars+0.125
chair+0.119
unct+0.115
Neg logit contributions
 fish-0.096
 tuna-0.096
 shrimp-0.094
 Wally-0.084
 fisher-0.084
Token chain
Token ID6333
Feature activation-1.20e-01

Pos logit contributions
appointed+0.184
unct+0.154
amas+0.149
bars+0.149
 halls+0.143
Neg logit contributions
 fish-0.124
 tuna-0.112
 named-0.107
 like-0.107
 shrimp-0.106
Token dance
Token ID9280
Feature activation-5.03e-02

Pos logit contributions
appointed+0.257
unct+0.250
amas+0.233
olk+0.232
chair+0.228
Neg logit contributions
 like-0.140
 and-0.122
 fish-0.110
 his-0.107
 tuna-0.106
Token in
Token ID287
Feature activation+4.20e-03

Pos logit contributions
appointed+0.123
unct+0.117
chair+0.108
bars+0.106
amas+0.105
Neg logit contributions
 tuna-0.047
 Wally-0.046
 shrimp-0.042
 like-0.037
 Finn-0.037
Token the
Token ID262
Feature activation-5.30e-02

Pos logit contributions
 Wally+0.006
 tuna+0.006
 Sebastian+0.006
 fisher+0.006
 shrimp+0.006
Neg logit contributions
appointed-0.008
chair-0.007
unct-0.007
 halls-0.007
irds-0.007
Token air
Token ID1633
Feature activation-2.33e-02

Pos logit contributions
cases+0.091
appointed+0.087
irds+0.086
owed+0.082
unct+0.080
Neg logit contributions
 tuna-0.092
 shrimp-0.090
 fish-0.083
 Wally-0.083
 named-0.078
Token.
Token ID13
Feature activation-3.56e-02

Pos logit contributions
appointed+0.049
chair+0.047
unct+0.045
irds+0.044
 halls+0.044
Neg logit contributions
 Wally-0.028
 tuna-0.027
 shrimp-0.026
 Sebastian-0.025
 fisher-0.025
Token.
Token ID13
Feature activation-2.33e-02

Pos logit contributions
appointed+0.062
chair+0.061
unct+0.057
irds+0.057
oga+0.056
Neg logit contributions
 Wally-0.033
 tuna-0.033
 shrimp-0.030
 fisher-0.030
 Sebastian-0.029
Token She
Token ID1375
Feature activation-1.44e-02

Pos logit contributions
appointed+0.042
unct+0.039
chair+0.039
irds+0.038
 halls+0.037
Neg logit contributions
 Wally-0.035
 tuna-0.033
 Sebastian-0.031
 Finn-0.031
 shrimp-0.031
Token will
Token ID481
Feature activation+1.64e-02

Pos logit contributions
appointed+0.027
chair+0.025
unct+0.024
irds+0.024
 halls+0.023
Neg logit contributions
 Wally-0.021
 tuna-0.020
 shrimp-0.019
 Sebastian-0.019
 fisher-0.019
Token make
Token ID787
Feature activation-5.40e-02

Pos logit contributions
 Wally+0.032
 tuna+0.032
 shrimp+0.032
 fisher+0.032
 Sebastian+0.031
Neg logit contributions
appointed-0.021
chair-0.018
unct-0.018
 roof-0.017
 halls-0.016
Token your
Token ID534
Feature activation-6.89e-02

Pos logit contributions
appointed+0.127
irds+0.119
chair+0.116
bows+0.113
rolling+0.113
Neg logit contributions
 tuna-0.067
 Wally-0.065
 shrimp-0.058
 Finn-0.057
 Nem-0.052
Token arm
Token ID3211
Feature activation-1.18e-01

Pos logit contributions
appointed+0.130
cases+0.127
irds+0.124
bows+0.123
cakes+0.123
Neg logit contributions
 tuna-0.090
 Wally-0.078
 fish-0.077
 named-0.074
 baseball-0.074
Token better
Token ID1365
Feature activation-3.43e-02

Pos logit contributions
irds+0.270
uttered+0.251
rolling+0.250
appointed+0.249
AM+0.247
Neg logit contributions
 and-0.106
 like-0.104
 too-0.094
 tuna-0.086
 who-0.082
Token.
Token ID13
Feature activation-2.16e-02

Pos logit contributions
appointed+0.081
chair+0.076
rolling+0.076
unct+0.074
cases+0.074
Neg logit contributions
 Wally-0.031
 tuna-0.031
 fisher-0.028
 shrimp-0.028
 fish-0.026
Token Come
Token ID7911
Feature activation+1.28e-02

Pos logit contributions
appointed+0.038
chair+0.035
unct+0.035
irds+0.034
 halls+0.033
Neg logit contributions
 Wally-0.034
 tuna-0.032
 Sebastian-0.030
 shrimp-0.030
 Finn-0.030
Token on
Token ID319
Feature activation-1.60e-02

Pos logit contributions
 Wally+0.019
 tuna+0.019
 shrimp+0.019
 Sebastian+0.018
 fisher+0.018
Neg logit contributions
appointed-0.022
chair-0.021
irds-0.019
amas-0.019
ring-0.019
Token,
Token ID11
Feature activation-1.06e-02

Pos logit contributions
appointed+0.035
chair+0.033
unct+0.032
irds+0.032
 halls+0.031
Neg logit contributions
 Wally-0.019
 tuna-0.018
 shrimp-0.017
 Sebastian-0.017
 Finn-0.016
Token him
Token ID683
Feature activation-2.94e-02

Pos logit contributions
 Wally+0.000
 tuna+0.000
 Sebastian+0.000
 shrimp+0.000
 fisher+0.000
Neg logit contributions
appointed-0.001
chair-0.001
unct-0.001
irds-0.001
 halls-0.001
Token.
Token ID13
Feature activation+2.85e-02

Pos logit contributions
appointed+0.063
chair+0.059
unct+0.058
amas+0.055
irds+0.055
Neg logit contributions
 tuna-0.036
 Wally-0.035
 shrimp-0.033
 fisher-0.031
 Sebastian-0.030
Token He
Token ID679
Feature activation+3.24e-02

Pos logit contributions
 shrimp+0.046
 Wally+0.045
 tuna+0.044
 fisher+0.043
 Sebastian+0.042
Neg logit contributions
chair-0.044
appointed-0.044
clock-0.041
irds-0.039
 Lightning-0.039
Token is
Token ID318
Feature activation+5.28e-02

Pos logit contributions
 Wally+0.058
 Sebastian+0.057
 shrimp+0.055
 fisher+0.054
 tuna+0.054
Neg logit contributions
chair-0.045
appointed-0.044
unct-0.041
 halls-0.039
cases-0.039
Token not
Token ID407
Feature activation+1.39e-02

Pos logit contributions
 Wally+0.100
 shrimp+0.098
 Sebastian+0.098
 fisher+0.096
 Finn+0.096
Neg logit contributions
appointed-0.057
clock-0.049
chair-0.046
 Lightning-0.045
 biggest-0.042
Token a
Token ID257
Feature activation+5.09e-02

Pos logit contributions
 Wally+0.022
 tuna+0.021
 Sebastian+0.021
 shrimp+0.021
 fisher+0.020
Neg logit contributions
appointed-0.023
chair-0.021
unct-0.020
ring-0.019
clock-0.019
Token bad
Token ID2089
Feature activation+2.33e-02

Pos logit contributions
 Wally+0.102
 Sebastian+0.101
 Finn+0.096
 tuna+0.094
 shrimp+0.092
Neg logit contributions
 roof-0.055
chair-0.048
 ceiling-0.046
 whole-0.046
 job-0.044
Token dog
Token ID3290
Feature activation+3.15e-02

Pos logit contributions
 Wally+0.044
 tuna+0.041
 Sebastian+0.041
 shrimp+0.040
 Finn+0.039
Neg logit contributions
chair-0.030
appointed-0.029
ness-0.027
bye-0.027
bows-0.027
Token.
Token ID13
Feature activation+2.79e-02

Pos logit contributions
 Sebastian+0.048
 Wally+0.047
 shrimp+0.046
 fisher+0.046
 tuna+0.045
Neg logit contributions
chair-0.054
appointed-0.048
ring-0.045
bows-0.044
 halls-0.044
Token He
Token ID679
Feature activation+3.20e-02

Pos logit contributions
 shrimp+0.045
 Wally+0.044
 tuna+0.043
 fisher+0.042
 Sebastian+0.042
Neg logit contributions
chair-0.045
appointed-0.043
clock-0.040
 Lightning-0.039
flash-0.037
Token is
Token ID318
Feature activation+5.54e-02

Pos logit contributions
 Wally+0.057
 Sebastian+0.056
 shrimp+0.055
 fisher+0.054
 tuna+0.053
Neg logit contributions
chair-0.045
appointed-0.045
unct-0.040
clock-0.039
 halls-0.038
Token and
Token ID290
Feature activation-3.79e-03

Pos logit contributions
 Wally+0.051
 tuna+0.048
 fisher+0.048
 Sebastian+0.047
 shrimp+0.047
Neg logit contributions
appointed-0.052
chair-0.050
clock-0.046
amas-0.045
unct-0.045
Token un
Token ID555
Feature activation-7.83e-03

Pos logit contributions
appointed+0.006
chair+0.006
unct+0.006
irds+0.005
 halls+0.005
Neg logit contributions
 Wally-0.006
 tuna-0.006
 shrimp-0.006
 Sebastian-0.006
 fisher-0.006
Tokenpack
Token ID8002
Feature activation+1.19e-02

Pos logit contributions
appointed+0.012
chair+0.012
unct+0.011
irds+0.011
 halls+0.010
Neg logit contributions
 Wally-0.014
 tuna-0.013
 Sebastian-0.013
 shrimp-0.013
 fisher-0.013
Token his
Token ID465
Feature activation+5.05e-03

Pos logit contributions
 Wally+0.017
 tuna+0.016
 fisher+0.016
 shrimp+0.015
 Sebastian+0.015
Neg logit contributions
appointed-0.022
chair-0.022
unct-0.020
 halls-0.020
cases-0.020
Token favourite
Token ID12507
Feature activation+1.73e-02

Pos logit contributions
 Wally+0.010
 Sebastian+0.009
 tuna+0.009
 fisher+0.009
 shrimp+0.009
Neg logit contributions
appointed-0.007
chair-0.007
unct-0.006
 halls-0.006
clock-0.006
Token food
Token ID2057
Feature activation+5.30e-02

Pos logit contributions
 Wally+0.035
 Sebastian+0.033
 fisher+0.032
 Fin+0.032
 Finn+0.031
Neg logit contributions
chair-0.022
appointed-0.021
 halls-0.020
amas-0.018
clock-0.018
Token.
Token ID13
Feature activation-1.32e-03

Pos logit contributions
 Wally+0.096
 fisher+0.093
 Sebastian+0.090
 shrimp+0.088
 tuna+0.087
Neg logit contributions
chair-0.075
cases-0.068
appointed-0.066
clock-0.062
 ceiling-0.062
Token He
Token ID679
Feature activation+1.30e-02

Pos logit contributions
appointed+0.002
chair+0.002
unct+0.002
irds+0.002
 halls+0.002
Neg logit contributions
 Wally-0.002
 tuna-0.002
 shrimp-0.002
 Sebastian-0.002
 fisher-0.002
Token was
Token ID373
Feature activation+9.62e-03

Pos logit contributions
 Wally+0.022
 tuna+0.021
 Sebastian+0.021
 shrimp+0.021
 fisher+0.020
Neg logit contributions
appointed-0.020
chair-0.020
 halls-0.019
unct-0.018
cases-0.018
Token delighted
Token ID20707
Feature activation+7.66e-03

Pos logit contributions
 Wally+0.016
 tuna+0.015
 shrimp+0.015
 fisher+0.015
 Sebastian+0.014
Neg logit contributions
appointed-0.016
chair-0.015
 halls-0.014
unct-0.014
clock-0.014
Token to
Token ID284
Feature activation+4.00e-03

Pos logit contributions
 Wally+0.012
 tuna+0.011
 Sebastian+0.011
 fisher+0.011
 shrimp+0.011
Neg logit contributions
appointed-0.013
chair-0.013
irds-0.012
unct-0.012
clock-0.012
Token and
Token ID290
Feature activation-5.71e-03

Pos logit contributions
 Wally+0.021
 tuna+0.020
 fisher+0.020
 fossil+0.020
 Sebastian+0.020
Neg logit contributions
chair-0.029
appointed-0.028
ring-0.025
irds-0.025
unct-0.025
Token he
Token ID339
Feature activation-1.23e-02

Pos logit contributions
appointed+0.010
chair+0.010
unct+0.009
irds+0.009
 halls+0.009
Neg logit contributions
 Wally-0.009
 tuna-0.008
 shrimp-0.008
 Sebastian-0.008
 fisher-0.008
Token asked
Token ID1965
Feature activation+1.18e-02

Pos logit contributions
appointed+0.021
chair+0.020
unct+0.019
irds+0.019
 halls+0.018
Neg logit contributions
 Wally-0.019
 tuna-0.019
 shrimp-0.018
 Sebastian-0.018
 fisher-0.018
Token his
Token ID465
Feature activation-9.90e-04

Pos logit contributions
 Wally+0.014
 tuna+0.013
 shrimp+0.013
 fisher+0.013
 Sebastian+0.012
Neg logit contributions
appointed-0.024
chair-0.023
unct-0.022
irds-0.021
 halls-0.021
Token brother
Token ID3956
Feature activation+2.01e-02

Pos logit contributions
appointed+0.001
chair+0.001
unct+0.001
irds+0.001
 halls+0.001
Neg logit contributions
 Wally-0.002
 tuna-0.002
 Sebastian-0.002
 shrimp-0.002
 fisher-0.002
Token,
Token ID11
Feature activation+5.46e-02

Pos logit contributions
 Wally+0.028
 tuna+0.027
 fisher+0.027
 shrimp+0.027
 Sebastian+0.027
Neg logit contributions
appointed-0.035
chair-0.034
unct-0.032
irds-0.031
cases-0.031
Token "
Token ID366
Feature activation+2.66e-02

Pos logit contributions
 fisher+0.092
 fossil+0.092
 tuna+0.091
 shrimp+0.091
 Wally+0.088
Neg logit contributions
appointed-0.069
 Lightning-0.063
chair-0.062
ring-0.059
rolling-0.059
TokenWhat
Token ID2061
Feature activation-2.61e-02

Pos logit contributions
 Wally+0.041
 tuna+0.040
 shrimp+0.038
 fisher+0.037
 Sebastian+0.036
Neg logit contributions
appointed-0.044
chair-0.042
unct-0.041
amas-0.040
ring-0.039
Token is
Token ID318
Feature activation+1.11e-02

Pos logit contributions
appointed+0.047
unct+0.042
chair+0.041
amas+0.039
irds+0.039
Neg logit contributions
 Wally-0.042
 tuna-0.041
 shrimp-0.039
 fisher-0.038
 fossil-0.037
Token that
Token ID326
Feature activation-2.91e-02

Pos logit contributions
 Wally+0.016
 fisher+0.016
 tuna+0.016
 Sebastian+0.015
 shrimp+0.015
Neg logit contributions
appointed-0.020
chair-0.019
unct-0.018
flash-0.017
 halls-0.017
Token?"
Token ID1701
Feature activation+1.64e-02

Pos logit contributions
appointed+0.055
chair+0.053
irds+0.051
unct+0.051
 halls+0.050
Neg logit contributions
 tuna-0.040
 Wally-0.040
 shrimp-0.038
 Sebastian-0.037
 fisher-0.037
Token so
Token ID523
Feature activation-1.58e-02

Pos logit contributions
 Wally+0.024
 fisher+0.022
 Sebastian+0.022
 tuna+0.022
 shrimp+0.021
Neg logit contributions
chair-0.037
appointed-0.037
 halls-0.034
 roof-0.033
clock-0.033
Token he
Token ID339
Feature activation+1.45e-02

Pos logit contributions
appointed+0.040
chair+0.038
unct+0.037
ring+0.037
irds+0.037
Neg logit contributions
 Wally-0.013
 tuna-0.012
 Sebastian-0.011
 shrimp-0.011
 fisher-0.010
Token asked
Token ID1965
Feature activation+2.43e-02

Pos logit contributions
 Wally+0.024
 tuna+0.022
 Sebastian+0.022
 shrimp+0.021
 fisher+0.021
Neg logit contributions
appointed-0.023
chair-0.023
 halls-0.021
unct-0.021
flash-0.021
Token his
Token ID465
Feature activation+5.94e-03

Pos logit contributions
 Wally+0.029
 shrimp+0.028
 tuna+0.028
 fisher+0.027
 Sebastian+0.026
Neg logit contributions
chair-0.045
appointed-0.044
 halls-0.042
flash-0.041
irds-0.040
Token friend
Token ID1545
Feature activation+2.96e-02

Pos logit contributions
 Wally+0.012
 Sebastian+0.011
 tuna+0.011
 shrimp+0.011
 fisher+0.011
Neg logit contributions
appointed-0.007
chair-0.007
irds-0.006
unct-0.006
 halls-0.006
Token,
Token ID11
Feature activation+4.43e-02

Pos logit contributions
 Wally+0.045
 fisher+0.044
 fossil+0.044
 tuna+0.044
 Sebastian+0.044
Neg logit contributions
chair-0.044
appointed-0.042
irds-0.039
up-0.038
 halls-0.038
Token Sam
Token ID3409
Feature activation+4.05e-02

Pos logit contributions
 fossil+0.073
 fisher+0.070
 tuna+0.069
 shrimp+0.068
 Wally+0.068
Neg logit contributions
appointed-0.053
chair-0.053
 Lightning-0.053
irds-0.050
 Her-0.049
Token,
Token ID11
Feature activation+4.13e-02

Pos logit contributions
 Sebastian+0.074
 fisher+0.073
 tuna+0.072
 fossil+0.072
 Wally+0.071
Neg logit contributions
chair-0.047
appointed-0.045
unct-0.045
ring-0.043
ail-0.043
Token for
Token ID329
Feature activation+1.90e-03

Pos logit contributions
 fisher+0.085
 Wally+0.083
 tuna+0.083
 shrimp+0.082
 Sebastian+0.082
Neg logit contributions
chair-0.043
appointed-0.036
ous-0.035
ring-0.034
 Lightning-0.033
Token help
Token ID1037
Feature activation+1.57e-02

Pos logit contributions
 Wally+0.003
 tuna+0.003
 Sebastian+0.003
 shrimp+0.003
 fisher+0.003
Neg logit contributions
appointed-0.003
chair-0.003
unct-0.003
 halls-0.003
irds-0.003
Token.
Token ID13
Feature activation+4.28e-03

Pos logit contributions
 Wally+0.027
 tuna+0.026
 shrimp+0.026
 fisher+0.026
 Sebastian+0.025
Neg logit contributions
appointed-0.024
chair-0.023
ring-0.021
unct-0.021
 halls-0.020
Token up
Token ID510
Feature activation-2.02e-02

Pos logit contributions
appointed+0.087
chair+0.086
irds+0.081
ring+0.079
 halls+0.077
Neg logit contributions
 tuna-0.056
 Wally-0.054
 shrimp-0.052
 Sebastian-0.048
 fisher-0.047
Token.
Token ID13
Feature activation+8.26e-03

Pos logit contributions
appointed+0.044
chair+0.042
unct+0.040
irds+0.040
amas+0.038
Neg logit contributions
 Wally-0.024
 tuna-0.023
 shrimp-0.022
 fisher-0.021
 Sebastian-0.021
Token They
Token ID1119
Feature activation+2.02e-02

Pos logit contributions
 Wally+0.013
 tuna+0.013
 shrimp+0.012
 fisher+0.012
 Sebastian+0.012
Neg logit contributions
appointed-0.014
chair-0.013
clock-0.012
 halls-0.012
irds-0.012
Token decide
Token ID5409
Feature activation+1.58e-02

Pos logit contributions
 Wally+0.037
 Sebastian+0.036
 tuna+0.036
 shrimp+0.035
 fisher+0.035
Neg logit contributions
appointed-0.028
chair-0.026
unct-0.025
 halls-0.023
 roof-0.023
Token to
Token ID284
Feature activation+2.95e-02

Pos logit contributions
 Wally+0.032
 tuna+0.030
 Sebastian+0.030
 fisher+0.030
 Finn+0.030
Neg logit contributions
appointed-0.020
chair-0.019
 halls-0.017
irds-0.017
unct-0.016
Token share
Token ID2648
Feature activation+4.58e-02

Pos logit contributions
 Wally+0.056
 tuna+0.054
 shrimp+0.054
 Sebastian+0.053
 fisher+0.052
Neg logit contributions
appointed-0.039
chair-0.034
 roof-0.033
unct-0.032
 halls-0.032
Token the
Token ID262
Feature activation+8.44e-03

Pos logit contributions
 Wally+0.077
 Sebastian+0.077
 Finn+0.077
 Otto+0.075
 fisher+0.075
Neg logit contributions
 walls-0.054
 halls-0.054
 ABC-0.053
chair-0.053
bows-0.053
Token car
Token ID1097
Feature activation-7.91e-03

Pos logit contributions
 Wally+0.017
 Sebastian+0.016
 tuna+0.016
 fisher+0.015
 shrimp+0.015
Neg logit contributions
appointed-0.011
chair-0.011
 halls-0.009
unct-0.009
 roof-0.009
Token.
Token ID13
Feature activation+6.69e-03

Pos logit contributions
appointed+0.017
chair+0.016
unct+0.016
irds+0.015
 halls+0.015
Neg logit contributions
 Wally-0.009
 tuna-0.009
 shrimp-0.008
 Sebastian-0.008
 fisher-0.008
Token They
Token ID1119
Feature activation+2.39e-02

Pos logit contributions
 Wally+0.010
 tuna+0.010
 shrimp+0.010
 fisher+0.010
 Sebastian+0.010
Neg logit contributions
appointed-0.012
chair-0.011
clock-0.010
irds-0.010
 halls-0.010
Token take
Token ID1011
Feature activation-2.46e-02

Pos logit contributions
 Wally+0.045
 Sebastian+0.044
 tuna+0.044
 shrimp+0.043
 fisher+0.042
Neg logit contributions
appointed-0.031
chair-0.030
unct-0.028
 halls-0.027
 roof-0.027
Token"
Token ID1
Feature activation+2.71e-02

Pos logit contributions
 Wally+0.036
 Sebastian+0.036
 tuna+0.035
 shrimp+0.034
 fossil+0.034
Neg logit contributions
appointed-0.027
chair-0.025
irds-0.023
 halls-0.023
unct-0.022
TokenCan
Token ID6090
Feature activation-1.50e-02

Pos logit contributions
 tuna+0.046
 Wally+0.046
 Sebastian+0.044
 fossil+0.044
 shrimp+0.043
Neg logit contributions
appointed-0.037
chair-0.036
irds-0.033
bye-0.032
ring-0.032
Token we
Token ID356
Feature activation+1.84e-02

Pos logit contributions
appointed+0.027
chair+0.025
unct+0.024
irds+0.023
 halls+0.023
Neg logit contributions
 Wally-0.024
 tuna-0.023
 shrimp-0.022
 Sebastian-0.021
 Finn-0.021
Token pet
Token ID4273
Feature activation+3.61e-03

Pos logit contributions
 Wally+0.034
 tuna+0.034
 fisher+0.033
 fossil+0.033
 shrimp+0.033
Neg logit contributions
appointed-0.024
chair-0.023
unct-0.021
ring-0.020
irds-0.020
Token him
Token ID683
Feature activation+1.53e-02

Pos logit contributions
 Wally+0.005
 tuna+0.005
 fisher+0.005
 Sebastian+0.005
 shrimp+0.005
Neg logit contributions
appointed-0.007
chair-0.006
unct-0.006
irds-0.006
 halls-0.006
Token?"
Token ID1701
Feature activation+1.36e-02

Pos logit contributions
 Wally+0.017
 tuna+0.016
 fisher+0.016
 shrimp+0.016
 Sebastian+0.016
Neg logit contributions
appointed-0.031
chair-0.031
unct-0.029
irds-0.028
ring-0.028
Token Ben
Token ID3932
Feature activation+3.56e-02

Pos logit contributions
 Wally+0.015
 fossil+0.015
 tuna+0.015
 fisher+0.014
 Sebastian+0.014
Neg logit contributions
appointed-0.027
chair-0.026
unct-0.025
 halls-0.024
irds-0.024
Token asked
Token ID1965
Feature activation+6.51e-03

Pos logit contributions
 Wally+0.053
 fossil+0.051
 tuna+0.051
 shrimp+0.049
 fisher+0.049
Neg logit contributions
appointed-0.062
chair-0.057
ring-0.055
clock-0.053
unct-0.052
Token the
Token ID262
Feature activation+9.02e-03

Pos logit contributions
 Wally+0.009
 tuna+0.009
 Sebastian+0.009
 shrimp+0.009
 fisher+0.009
Neg logit contributions
appointed-0.012
chair-0.011
unct-0.010
irds-0.010
 halls-0.010
Token man
Token ID582
Feature activation+4.13e-03

Pos logit contributions
 Wally+0.016
 tuna+0.015
 Sebastian+0.015
 shrimp+0.014
 Finn+0.014
Neg logit contributions
chair-0.013
appointed-0.013
 halls-0.012
unct-0.012
irds-0.011
Token.
Token ID13
Feature activation+4.52e-02

Pos logit contributions
 Wally+0.006
 tuna+0.006
 Sebastian+0.006
 shrimp+0.006
 fisher+0.005
Neg logit contributions
appointed-0.007
chair-0.007
unct-0.007
irds-0.007
ring-0.007
Token.
Token ID13
Feature activation-2.59e-02

Pos logit contributions
appointed+0.051
chair+0.048
unct+0.046
 halls+0.045
irds+0.045
Neg logit contributions
 Wally-0.029
 tuna-0.027
 shrimp-0.026
 Sebastian-0.026
 fisher-0.025
Token The
Token ID383
Feature activation-4.72e-03

Pos logit contributions
appointed+0.037
unct+0.035
chair+0.035
 halls+0.033
irds+0.032
Neg logit contributions
 Wally-0.050
 tuna-0.047
 Sebastian-0.045
 shrimp-0.044
 Finn-0.044
Token snake
Token ID17522
Feature activation+2.89e-03

Pos logit contributions
appointed+0.010
chair+0.009
unct+0.009
irds+0.009
 halls+0.008
Neg logit contributions
 Wally-0.006
 tuna-0.006
 shrimp-0.006
 Sebastian-0.006
 fisher-0.005
Token calmed
Token ID49566
Feature activation-2.35e-02

Pos logit contributions
 Wally+0.005
 tuna+0.004
 Sebastian+0.004
 shrimp+0.004
 fisher+0.004
Neg logit contributions
appointed-0.005
chair-0.005
unct-0.004
irds-0.004
 halls-0.004
Token down
Token ID866
Feature activation-2.46e-04

Pos logit contributions
appointed+0.031
chair+0.029
 halls+0.027
irds+0.026
amas+0.026
Neg logit contributions
 Wally-0.048
 tuna-0.044
 Sebastian-0.044
 shrimp-0.044
 Finn-0.043
Token and
Token ID290
Feature activation-9.15e-03

Pos logit contributions
appointed+0.001
chair+0.001
unct+0.001
irds+0.001
 halls+0.000
Neg logit contributions
 Wally-0.000
 tuna-0.000
 Sebastian-0.000
 shrimp-0.000
 fisher-0.000
Token sn
Token ID3013
Feature activation-1.40e-02

Pos logit contributions
appointed+0.017
chair+0.016
unct+0.015
irds+0.015
 halls+0.015
Neg logit contributions
 Wally-0.013
 tuna-0.013
 shrimp-0.012
 Sebastian-0.012
 fisher-0.012
Tokenugg
Token ID6837
Feature activation-1.81e-02

Pos logit contributions
appointed+0.019
chair+0.018
unct+0.016
 halls+0.016
irds+0.016
Neg logit contributions
 Wally-0.028
 tuna-0.026
 Sebastian-0.026
 shrimp-0.026
 fisher-0.025
Tokenled
Token ID992
Feature activation-3.04e-04

Pos logit contributions
appointed+0.026
chair+0.025
irds+0.023
unct+0.023
flash+0.022
Neg logit contributions
 Wally-0.034
 tuna-0.033
 shrimp-0.032
 Sebastian-0.031
 fisher-0.030
Token further
Token ID2252
Feature activation+4.81e-03

Pos logit contributions
appointed+0.001
chair+0.000
unct+0.000
irds+0.000
 halls+0.000
Neg logit contributions
 Wally-0.000
 tuna-0.000
 Sebastian-0.000
 shrimp-0.000
 fisher-0.000
Token into
Token ID656
Feature activation+1.90e-04

Pos logit contributions
 Wally+0.008
 tuna+0.008
 Sebastian+0.007
 shrimp+0.007
 fisher+0.007
Neg logit contributions
appointed-0.008
chair-0.008
unct-0.007
 halls-0.007
irds-0.007
Token The
Token ID383
Feature activation+2.91e-03

Pos logit contributions
 fisher+0.046
 Wally+0.046
 shrimp+0.045
 tuna+0.045
 fossil+0.044
Neg logit contributions
appointed-0.032
chair-0.032
 Lightning-0.028
irds-0.027
flash-0.027
Token rock
Token ID3881
Feature activation-1.92e-03

Pos logit contributions
 Wally+0.005
 tuna+0.004
 Sebastian+0.004
 shrimp+0.004
 fisher+0.004
Neg logit contributions
appointed-0.005
chair-0.005
unct-0.004
irds-0.004
 halls-0.004
Token flies
Token ID17607
Feature activation-3.52e-03

Pos logit contributions
appointed+0.003
chair+0.003
unct+0.003
irds+0.002
 halls+0.002
Neg logit contributions
 Wally-0.003
 tuna-0.003
 Sebastian-0.003
 shrimp-0.003
 fisher-0.003
Token up
Token ID510
Feature activation+3.36e-02

Pos logit contributions
appointed+0.006
chair+0.006
unct+0.005
irds+0.005
 halls+0.005
Neg logit contributions
 Wally-0.006
 tuna-0.006
 shrimp-0.005
 Sebastian-0.005
 fisher-0.005
Token and
Token ID290
Feature activation+1.16e-02

Pos logit contributions
 Wally+0.050
 Sebastian+0.049
 tuna+0.048
 fisher+0.047
 shrimp+0.047
Neg logit contributions
appointed-0.058
chair-0.054
clock-0.051
 ceiling-0.051
flash-0.051
Token hits
Token ID7127
Feature activation+8.68e-03

Pos logit contributions
 Wally+0.024
 Sebastian+0.023
 fisher+0.023
 tuna+0.023
 shrimp+0.023
Neg logit contributions
appointed-0.013
chair-0.013
unct-0.012
 halls-0.011
flash-0.011
Token Tim
Token ID5045
Feature activation-2.51e-02

Pos logit contributions
 Wally+0.012
 tuna+0.012
 fisher+0.012
 shrimp+0.012
 Sebastian+0.011
Neg logit contributions
appointed-0.016
chair-0.015
flash-0.014
unct-0.014
irds-0.014
Token in
Token ID287
Feature activation+1.67e-02

Pos logit contributions
appointed+0.063
chair+0.060
unct+0.058
irds+0.058
 halls+0.058
Neg logit contributions
 Wally-0.019
 tuna-0.018
 fisher-0.015
 shrimp-0.015
 Sebastian-0.015
Token the
Token ID262
Feature activation-1.18e-02

Pos logit contributions
 Wally+0.029
 shrimp+0.027
 fisher+0.027
 Sebastian+0.027
 tuna+0.027
Neg logit contributions
appointed-0.024
chair-0.024
 ceiling-0.022
 halls-0.021
flash-0.021
Token head
Token ID1182
Feature activation-1.67e-02

Pos logit contributions
appointed+0.016
chair+0.015
irds+0.014
unct+0.014
ring+0.013
Neg logit contributions
 Wally-0.023
 tuna-0.023
 shrimp-0.022
 fisher-0.021
 Sebastian-0.021
Token.
Token ID13
Feature activation+2.67e-02

Pos logit contributions
appointed+0.034
chair+0.033
unct+0.032
irds+0.031
 halls+0.031
Neg logit contributions
 Wally-0.020
 tuna-0.019
 shrimp-0.018
 fisher-0.018
 Sebastian-0.018
Token Then
Token ID3244
Feature activation-2.40e-03

Pos logit contributions
 Wally+0.013
 tuna+0.013
 fisher+0.012
 Sebastian+0.012
 shrimp+0.012
Neg logit contributions
appointed-0.013
chair-0.012
irds-0.012
 halls-0.011
unct-0.011
Token,
Token ID11
Feature activation-3.69e-03

Pos logit contributions
appointed+0.005
chair+0.005
unct+0.004
irds+0.004
 halls+0.004
Neg logit contributions
 Wally-0.003
 tuna-0.003
 shrimp-0.003
 Sebastian-0.003
 fisher-0.003
Token they
Token ID484
Feature activation+1.38e-02

Pos logit contributions
appointed+0.007
chair+0.007
unct+0.007
irds+0.007
 halls+0.007
Neg logit contributions
 Wally-0.005
 tuna-0.005
 shrimp-0.004
 Sebastian-0.004
 fisher-0.004
Token spotted
Token ID13489
Feature activation+1.04e-02

Pos logit contributions
 Wally+0.024
 tuna+0.023
 Sebastian+0.022
 fisher+0.022
 fossil+0.022
Neg logit contributions
appointed-0.021
chair-0.020
unct-0.019
 halls-0.018
irds-0.018
Token a
Token ID257
Feature activation+2.86e-02

Pos logit contributions
 Wally+0.017
 tuna+0.016
 fisher+0.016
 Sebastian+0.016
 shrimp+0.015
Neg logit contributions
appointed-0.016
chair-0.015
unct-0.014
irds-0.014
 halls-0.014
Token giant
Token ID6175
Feature activation+3.64e-02

Pos logit contributions
 Wally+0.052
 Sebastian+0.050
 fisher+0.048
 Finn+0.048
 tuna+0.047
Neg logit contributions
chair-0.039
appointed-0.038
 roof-0.034
 ceiling-0.034
unct-0.033
Token pile
Token ID14540
Feature activation+4.28e-02

Pos logit contributions
 Wally+0.059
 Sebastian+0.057
 fisher+0.054
 Finn+0.053
 tuna+0.052
Neg logit contributions
chair-0.061
appointed-0.054
 roof-0.051
unct-0.051
 ceiling-0.050
Token of
Token ID286
Feature activation+3.07e-02

Pos logit contributions
 Wally+0.085
 Sebastian+0.082
 tuna+0.081
 fisher+0.080
 fossil+0.079
Neg logit contributions
chair-0.054
appointed-0.051
ring-0.048
iece-0.047
 halls-0.045
Token leaves
Token ID5667
Feature activation+3.01e-02

Pos logit contributions
 Wally+0.049
 Sebastian+0.047
 fisher+0.047
 Crab+0.043
 Finn+0.043
Neg logit contributions
chair-0.047
appointed-0.045
 halls-0.043
 roof-0.042
 floors-0.042
Token.
Token ID13
Feature activation+2.37e-02

Pos logit contributions
 Wally+0.047
 fisher+0.045
 Sebastian+0.045
 tuna+0.044
 fossil+0.043
Neg logit contributions
appointed-0.051
chair-0.050
ring-0.045
unct-0.044
iece-0.044
Token\n
Token ID198
Feature activation+2.07e-02

Pos logit contributions
 Wally+0.039
 tuna+0.039
 fisher+0.038
 fossil+0.037
 Sebastian+0.037
Neg logit contributions
appointed-0.035
chair-0.033
irds-0.031
 halls-0.031
amas-0.030
Token of
Token ID286
Feature activation+2.52e-02

Pos logit contributions
 Wally+0.018
 Sebastian+0.018
 tuna+0.017
 fisher+0.017
 shrimp+0.016
Neg logit contributions
chair-0.017
appointed-0.016
 halls-0.014
irds-0.014
filled-0.014
Token all
Token ID477
Feature activation-8.70e-03

Pos logit contributions
 fisher+0.048
 Wally+0.046
 fossil+0.046
 Sebastian+0.046
 tuna+0.045
Neg logit contributions
appointed-0.033
chair-0.032
 halls-0.029
 ceiling-0.028
 roof-0.028
Token,
Token ID11
Feature activation-1.20e-02

Pos logit contributions
appointed+0.017
chair+0.016
unct+0.016
irds+0.016
 halls+0.015
Neg logit contributions
 Wally-0.011
 tuna-0.011
 shrimp-0.011
 Sebastian-0.010
 fisher-0.010
Token she
Token ID673
Feature activation+2.40e-02

Pos logit contributions
appointed+0.019
chair+0.018
unct+0.017
irds+0.017
 halls+0.016
Neg logit contributions
 Wally-0.021
 tuna-0.020
 shrimp-0.020
 Sebastian-0.019
 fisher-0.019
Token was
Token ID373
Feature activation-4.25e-03

Pos logit contributions
 Wally+0.043
 tuna+0.041
 Sebastian+0.040
 fisher+0.040
 shrimp+0.039
Neg logit contributions
chair-0.036
appointed-0.034
unct-0.032
 halls-0.031
ring-0.030
Token sad
Token ID6507
Feature activation+6.22e-03

Pos logit contributions
appointed+0.008
chair+0.008
unct+0.007
irds+0.007
 halls+0.007
Neg logit contributions
 Wally-0.006
 tuna-0.006
 shrimp-0.005
 Sebastian-0.005
 fisher-0.005
Token because
Token ID780
Feature activation-9.06e-03

Pos logit contributions
 Wally+0.008
 tuna+0.008
 fisher+0.008
 Sebastian+0.008
 shrimp+0.008
Neg logit contributions
chair-0.012
appointed-0.012
unct-0.011
 halls-0.011
irds-0.011
Token she
Token ID673
Feature activation-2.78e-03

Pos logit contributions
appointed+0.012
chair+0.011
unct+0.011
irds+0.010
 halls+0.010
Neg logit contributions
 Wally-0.018
 tuna-0.017
 shrimp-0.017
 Sebastian-0.017
 fisher-0.016
Token missed
Token ID6825
Feature activation+5.45e-03

Pos logit contributions
appointed+0.005
chair+0.005
unct+0.005
irds+0.004
 halls+0.004
Neg logit contributions
 Wally-0.004
 tuna-0.004
 shrimp-0.004
 Sebastian-0.004
 fisher-0.004
Token her
Token ID607
Feature activation+7.59e-04

Pos logit contributions
 Wally+0.011
 tuna+0.010
 fisher+0.010
 Sebastian+0.010
 shrimp+0.010
Neg logit contributions
appointed-0.007
chair-0.007
unct-0.006
 halls-0.006
irds-0.006
Token home
Token ID1363
Feature activation-5.29e-02

Pos logit contributions
 Wally+0.001
 tuna+0.001
 Sebastian+0.001
 fisher+0.001
 shrimp+0.001
Neg logit contributions
appointed-0.001
chair-0.001
unct-0.001
 halls-0.001
irds-0.001
Token too
Token ID1165
Feature activation-6.24e-03

Pos logit contributions
appointed+0.035
chair+0.034
unct+0.032
irds+0.031
 halls+0.031
Neg logit contributions
 Wally-0.025
 tuna-0.024
 shrimp-0.023
 Sebastian-0.023
 fisher-0.022
Token.
Token ID13
Feature activation-1.51e-02

Pos logit contributions
appointed+0.012
chair+0.011
unct+0.011
irds+0.010
 halls+0.010
Neg logit contributions
 Wally-0.009
 tuna-0.008
 shrimp-0.008
 Sebastian-0.008
 fisher-0.008
Token The
Token ID383
Feature activation-9.37e-03

Pos logit contributions
appointed+0.023
chair+0.022
unct+0.022
irds+0.020
 halls+0.020
Neg logit contributions
 Wally-0.027
 tuna-0.026
 Sebastian-0.025
 shrimp-0.025
 fisher-0.024
Token little
Token ID1310
Feature activation-1.02e-02

Pos logit contributions
appointed+0.013
chair+0.012
unct+0.011
irds+0.011
 halls+0.011
Neg logit contributions
 Wally-0.018
 tuna-0.018
 shrimp-0.017
 Sebastian-0.017
 fisher-0.017
Token girl
Token ID2576
Feature activation-1.65e-02

Pos logit contributions
appointed+0.015
chair+0.014
unct+0.013
irds+0.013
 halls+0.012
Neg logit contributions
 Wally-0.019
 tuna-0.018
 shrimp-0.018
 Sebastian-0.018
 fisher-0.017
Token couldn
Token ID3521
Feature activation-3.69e-02

Pos logit contributions
appointed+0.031
chair+0.029
unct+0.027
irds+0.027
 halls+0.026
Neg logit contributions
 Wally-0.024
 tuna-0.023
 shrimp-0.022
 Sebastian-0.022
 fisher-0.021
Token't
Token ID470
Feature activation+1.21e-02

Pos logit contributions
appointed+0.059
irds+0.056
 halls+0.056
chair+0.055
unct+0.055
Neg logit contributions
 Wally-0.063
 tuna-0.058
 Sebastian-0.057
 Fin-0.055
 Finn-0.055
Token believe
Token ID1975
Feature activation+1.04e-02

Pos logit contributions
 Wally+0.022
 tuna+0.021
 Sebastian+0.021
 shrimp+0.020
 fisher+0.020
Neg logit contributions
appointed-0.018
chair-0.017
unct-0.016
ring-0.016
 roof-0.015
Token it
Token ID340
Feature activation-7.94e-03

Pos logit contributions
 Wally+0.017
 tuna+0.017
 fisher+0.016
 Sebastian+0.016
 shrimp+0.016
Neg logit contributions
appointed-0.016
chair-0.016
unct-0.015
irds-0.014
 halls-0.014
Token -
Token ID532
Feature activation-2.02e-02

Pos logit contributions
appointed+0.016
chair+0.015
unct+0.014
irds+0.014
 halls+0.014
Neg logit contributions
 Wally-0.010
 tuna-0.010
 shrimp-0.010
 Sebastian-0.009
 fisher-0.009
Token the
Token ID262
Feature activation-9.71e-03

Pos logit contributions
appointed+0.039
chair+0.037
unct+0.035
irds+0.034
amas+0.034
Neg logit contributions
 Wally-0.028
 tuna-0.027
 shrimp-0.026
 Sebastian-0.026
 fisher-0.025
Token books
Token ID3835
Feature activation-1.05e-02

Pos logit contributions
appointed+0.054
irds+0.049
unct+0.049
ring+0.046
rolling+0.045
Neg logit contributions
 tuna-0.059
 Wally-0.057
 shrimp-0.055
 fossil-0.053
 Sebastian-0.051
Token.
Token ID13
Feature activation-1.73e-02

Pos logit contributions
appointed+0.022
chair+0.021
unct+0.020
irds+0.020
 halls+0.020
Neg logit contributions
 Wally-0.012
 tuna-0.012
 shrimp-0.011
 Sebastian-0.011
 fisher-0.011
Token Lily
Token ID20037
Feature activation-3.89e-02

Pos logit contributions
appointed+0.034
unct+0.033
chair+0.032
irds+0.032
ring+0.031
Neg logit contributions
 Wally-0.023
 tuna-0.022
 Sebastian-0.021
 shrimp-0.021
 Finn-0.019
Token's
Token ID338
Feature activation-4.55e-02

Pos logit contributions
appointed+0.081
irds+0.077
unct+0.075
chair+0.074
filled+0.073
Neg logit contributions
 Wally-0.041
 tuna-0.040
 named-0.039
 Sebastian-0.036
 fisher-0.036
Token eyes
Token ID2951
Feature activation-7.73e-02

Pos logit contributions
appointed+0.096
unct+0.087
irds+0.086
ring+0.085
filled+0.084
Neg logit contributions
 tuna-0.060
 Wally-0.055
 shrimp-0.053
 fossil-0.049
 fisher-0.047
Token got
Token ID1392
Feature activation-4.58e-02

Pos logit contributions
irds+0.155
unct+0.152
appointed+0.145
EEP+0.143
oga+0.141
Neg logit contributions
 like-0.084
 tuna-0.078
 named-0.073
 and-0.072
 shrimp-0.071
Token big
Token ID1263
Feature activation+7.11e-03

Pos logit contributions
appointed+0.094
irds+0.092
chair+0.089
EEP+0.086
unct+0.085
Neg logit contributions
 tuna-0.056
 shrimp-0.053
 Wally-0.052
 Sebastian-0.049
 like-0.044
Token when
Token ID618
Feature activation-5.40e-02

Pos logit contributions
 Wally+0.009
 tuna+0.008
 Sebastian+0.008
 fisher+0.008
 shrimp+0.008
Neg logit contributions
appointed-0.015
chair-0.015
unct-0.014
 halls-0.013
ring-0.013
Token she
Token ID673
Feature activation-3.95e-02

Pos logit contributions
appointed+0.105
ring+0.103
irds+0.098
unct+0.097
age+0.094
Neg logit contributions
 Wally-0.081
 tuna-0.077
 shrimp-0.074
 Sebastian-0.071
 Otto-0.067
Token saw
Token ID2497
Feature activation-3.88e-02

Pos logit contributions
appointed+0.091
irds+0.084
chair+0.081
unct+0.080
amas+0.079
Neg logit contributions
 tuna-0.039
 Wally-0.037
 named-0.036
 shrimp-0.035
 fisher-0.033
Token all
Token ID477
Feature activation-3.78e-02

Pos logit contributions
appointed+0.089
unct+0.081
irds+0.080
EEP+0.078
amas+0.077
Neg logit contributions
 tuna-0.047
 Wally-0.045
 shrimp-0.044
 Sebastian-0.042
 Crab-0.037
Token.
Token ID13
Feature activation+9.08e-03

Pos logit contributions
 Sebastian+0.065
 Wally+0.065
 fisher+0.065
 fossil+0.062
 tuna+0.061
Neg logit contributions
chair-0.033
appointed-0.030
oga-0.029
clock-0.029
 halls-0.029
Token "
Token ID366
Feature activation+1.09e-02

Pos logit contributions
 Wally+0.013
 fisher+0.012
 tuna+0.012
 shrimp+0.012
 Sebastian+0.012
Neg logit contributions
appointed-0.017
chair-0.017
 halls-0.015
amas-0.014
irds-0.014
TokenWhat
Token ID2061
Feature activation-2.32e-02

Pos logit contributions
 Wally+0.017
 tuna+0.017
 fisher+0.017
 Sebastian+0.016
 shrimp+0.016
Neg logit contributions
appointed-0.019
chair-0.018
unct-0.016
irds-0.016
 halls-0.016
Token's
Token ID338
Feature activation-1.81e-03

Pos logit contributions
appointed+0.038
unct+0.033
chair+0.033
irds+0.031
 halls+0.031
Neg logit contributions
 Wally-0.041
 tuna-0.040
 shrimp-0.039
 fisher-0.038
 Sebastian-0.037
Token in
Token ID287
Feature activation+1.54e-02

Pos logit contributions
appointed+0.003
chair+0.003
unct+0.003
irds+0.003
 halls+0.003
Neg logit contributions
 Wally-0.003
 tuna-0.003
 shrimp-0.003
 Sebastian-0.003
 fisher-0.003
Token the
Token ID262
Feature activation-1.82e-02

Pos logit contributions
 Wally+0.023
 fisher+0.021
 Sebastian+0.021
 tuna+0.021
 Finn+0.021
Neg logit contributions
chair-0.026
appointed-0.025
unct-0.023
 halls-0.023
flash-0.023
Token box
Token ID3091
Feature activation-6.83e-03

Pos logit contributions
appointed+0.035
chair+0.033
irds+0.033
unct+0.032
ring+0.031
Neg logit contributions
 tuna-0.025
 Wally-0.024
 shrimp-0.023
 fossil-0.023
 Sebastian-0.022
Token?"
Token ID1701
Feature activation+2.14e-02

Pos logit contributions
appointed+0.016
chair+0.015
unct+0.014
irds+0.014
 halls+0.014
Neg logit contributions
 Wally-0.007
 tuna-0.007
 shrimp-0.006
 Sebastian-0.006
 fisher-0.006
Token he
Token ID339
Feature activation-3.56e-03

Pos logit contributions
 fisher+0.034
 Wally+0.033
 tuna+0.032
 fossil+0.032
 shrimp+0.032
Neg logit contributions
chair-0.036
appointed-0.035
 halls-0.029
unct-0.029
cases-0.028
Token asked
Token ID1965
Feature activation-9.60e-04

Pos logit contributions
appointed+0.008
chair+0.007
unct+0.007
irds+0.007
 halls+0.007
Neg logit contributions
 Wally-0.004
 tuna-0.004
 shrimp-0.004
 Sebastian-0.004
 fisher-0.004
Token.
Token ID13
Feature activation+1.06e-02

Pos logit contributions
appointed+0.002
chair+0.002
unct+0.002
irds+0.002
 halls+0.002
Neg logit contributions
 Wally-0.001
 tuna-0.001
 shrimp-0.001
 Sebastian-0.001
 fisher-0.001
Token on
Token ID319
Feature activation-1.21e-02

Pos logit contributions
appointed+0.062
unct+0.059
chair+0.058
irds+0.057
rolling+0.056
Neg logit contributions
 Wally-0.034
 tuna-0.032
 shrimp-0.031
 Finn-0.029
 Sebastian-0.028
Token her
Token ID607
Feature activation-5.14e-02

Pos logit contributions
appointed+0.019
chair+0.018
unct+0.017
irds+0.017
 halls+0.016
Neg logit contributions
 Wally-0.021
 tuna-0.020
 shrimp-0.020
 Sebastian-0.019
 fisher-0.019
Token face
Token ID1986
Feature activation-6.26e-02

Pos logit contributions
appointed+0.092
chair+0.076
unct+0.075
limits+0.075
oga+0.074
Neg logit contributions
 tuna-0.078
 shrimp-0.076
 fish-0.072
 Wally-0.070
 baseball-0.070
Token and
Token ID290
Feature activation-3.34e-02

Pos logit contributions
appointed+0.164
unct+0.160
irds+0.151
chair+0.149
 halls+0.149
Neg logit contributions
 tuna-0.039
 Wally-0.034
 like-0.032
 shrimp-0.031
 Sebastian-0.028
Token legs
Token ID7405
Feature activation-5.50e-02

Pos logit contributions
appointed+0.062
chair+0.057
amas+0.056
irds+0.055
unct+0.055
Neg logit contributions
 Wally-0.049
 tuna-0.048
 shrimp-0.046
 Finn-0.044
 Sebastian-0.044
Token.
Token ID13
Feature activation-2.84e-02

Pos logit contributions
appointed+0.130
chair+0.127
irds+0.126
unct+0.126
 halls+0.120
Neg logit contributions
 tuna-0.040
 Wally-0.040
 like-0.039
 shrimp-0.036
 Sebastian-0.034
Token She
Token ID1375
Feature activation-9.95e-04

Pos logit contributions
appointed+0.045
unct+0.045
irds+0.042
chair+0.042
ring+0.041
Neg logit contributions
 Wally-0.050
 tuna-0.047
 Finn-0.045
 Sebastian-0.044
 shrimp-0.044
Token laughed
Token ID13818
Feature activation-1.12e-02

Pos logit contributions
appointed+0.002
chair+0.002
unct+0.002
irds+0.001
 halls+0.001
Neg logit contributions
 Wally-0.002
 tuna-0.002
 Sebastian-0.001
 shrimp-0.001
 fisher-0.001
Token and
Token ID290
Feature activation-6.82e-03

Pos logit contributions
appointed+0.027
chair+0.027
unct+0.026
irds+0.025
 halls+0.025
Neg logit contributions
 Wally-0.010
 tuna-0.009
 shrimp-0.008
 Sebastian-0.008
 fisher-0.008
Token sang
Token ID25889
Feature activation-4.23e-02

Pos logit contributions
appointed+0.012
chair+0.012
unct+0.011
irds+0.011
 halls+0.011
Neg logit contributions
 Wally-0.010
 tuna-0.010
 shrimp-0.010
 Sebastian-0.009
 fisher-0.009
Token.
Token ID13
Feature activation-2.44e-02

Pos logit contributions
appointed+0.096
unct+0.095
chair+0.091
bars+0.090
irds+0.088
Neg logit contributions
 Wally-0.043
 tuna-0.041
 like-0.039
 shrimp-0.037
 Finn-0.036
Token hands
Token ID2832
Feature activation-2.07e-06

Pos logit contributions
 Wally+0.006
 tuna+0.006
 shrimp+0.006
 fisher+0.006
 Sebastian+0.006
Neg logit contributions
appointed-0.006
chair-0.006
unct-0.005
irds-0.005
ring-0.005
Token and
Token ID290
Feature activation-2.59e-02

Feature not active on prev token; no predictions were affected.
Token scattered
Token ID16830
Feature activation+2.19e-02

Pos logit contributions
appointed+0.052
chair+0.048
unct+0.046
irds+0.046
amas+0.045
Neg logit contributions
 tuna-0.035
 Wally-0.035
 shrimp-0.033
 Sebastian-0.031
 Finn-0.030
Token it
Token ID340
Feature activation-2.39e-02

Pos logit contributions
 Wally+0.037
 fisher+0.035
 Sebastian+0.035
 tuna+0.034
 shrimp+0.034
Neg logit contributions
chair-0.034
appointed-0.033
 halls-0.030
iece-0.029
 floors-0.029
Token everywhere
Token ID8347
Feature activation-1.22e-02

Pos logit contributions
appointed+0.047
chair+0.045
irds+0.043
unct+0.043
 halls+0.041
Neg logit contributions
 Wally-0.032
 tuna-0.030
 shrimp-0.028
 Sebastian-0.027
 Finn-0.027
Token.
Token ID13
Feature activation-2.39e-02

Pos logit contributions
appointed+0.025
chair+0.023
unct+0.022
irds+0.022
 halls+0.022
Neg logit contributions
 Wally-0.016
 tuna-0.015
 shrimp-0.014
 Sebastian-0.014
 fisher-0.014
Token
Token ID198
Feature activation+7.10e-03

Pos logit contributions
appointed+0.050
unct+0.049
chair+0.047
irds+0.047
ring+0.046
Neg logit contributions
 Wally-0.029
 tuna-0.028
 Sebastian-0.026
 Finn-0.025
 Otto-0.024
Token
Token ID198
Feature activation-5.69e-03

Pos logit contributions
 Wally+0.010
 tuna+0.010
 shrimp+0.010
 fisher+0.009
 Sebastian+0.009
Neg logit contributions
appointed-0.013
chair-0.013
irds-0.012
unct-0.012
 halls-0.012
TokenSuddenly
Token ID38582
Feature activation-1.70e-02

Pos logit contributions
appointed+0.009
chair+0.009
unct+0.008
irds+0.008
 halls+0.008
Neg logit contributions
 Wally-0.010
 tuna-0.009
 shrimp-0.009
 Sebastian-0.009
 fisher-0.009
Token,
Token ID11
Feature activation-1.58e-02

Pos logit contributions
appointed+0.039
chair+0.037
unct+0.035
irds+0.035
ring+0.034
Neg logit contributions
 Wally-0.019
 tuna-0.018
 Sebastian-0.017
 shrimp-0.017
 fisher-0.016
Token his
Token ID465
Feature activation-3.22e-02

Pos logit contributions
appointed+0.035
chair+0.034
unct+0.032
irds+0.032
ring+0.032
Neg logit contributions
 Wally-0.018
 tuna-0.017
 Sebastian-0.016
 shrimp-0.016
 fisher-0.015
Token her
Token ID607
Feature activation+9.80e-03

Pos logit contributions
appointed+0.036
chair+0.033
unct+0.032
amas+0.031
 halls+0.030
Neg logit contributions
 Wally-0.031
 shrimp-0.029
 tuna-0.029
 Sebastian-0.028
 fisher-0.028
Token magical
Token ID10883
Feature activation+1.50e-02

Pos logit contributions
 Wally+0.016
 Sebastian+0.015
 tuna+0.015
 shrimp+0.014
 fisher+0.014
Neg logit contributions
appointed-0.016
chair-0.016
irds-0.014
amas-0.014
unct-0.014
Token whip
Token ID22972
Feature activation-3.77e-02

Pos logit contributions
 Wally+0.024
 tuna+0.022
 Sebastian+0.022
 Finn+0.022
 shrimp+0.022
Neg logit contributions
appointed-0.025
chair-0.025
irds-0.023
amas-0.022
unct-0.022
Token around
Token ID1088
Feature activation-3.63e-03

Pos logit contributions
appointed+0.088
unct+0.083
chair+0.083
 halls+0.083
amas+0.082
Neg logit contributions
 Wally-0.035
 tuna-0.033
 shrimp-0.030
 fisher-0.028
 Sebastian-0.028
Token and
Token ID290
Feature activation-3.94e-02

Pos logit contributions
appointed+0.006
chair+0.006
unct+0.005
irds+0.005
 halls+0.005
Neg logit contributions
 Wally-0.006
 tuna-0.006
 shrimp-0.005
 Sebastian-0.005
 fisher-0.005
Token made
Token ID925
Feature activation-9.06e-02

Pos logit contributions
appointed+0.084
amas+0.080
chair+0.080
irds+0.078
unct+0.076
Neg logit contributions
 Wally-0.046
 tuna-0.041
 Sebastian-0.041
 shrimp-0.040
 Finn-0.039
Token funny
Token ID8258
Feature activation-5.72e-02

Pos logit contributions
appointed+0.231
chair+0.205
unct+0.205
EEP+0.202
rated+0.199
Neg logit contributions
 tuna-0.088
 Wally-0.080
 shrimp-0.078
 Finn-0.074
 fish-0.071
Token shapes
Token ID15268
Feature activation-1.56e-02

Pos logit contributions
appointed+0.083
unct+0.070
chair+0.068
flash+0.065
bars+0.065
Neg logit contributions
 tuna-0.114
 shrimp-0.113
 fish-0.108
 fisher-0.104
 Wally-0.103
Token in
Token ID287
Feature activation+1.96e-02

Pos logit contributions
appointed+0.033
chair+0.031
unct+0.030
irds+0.029
 halls+0.029
Neg logit contributions
 Wally-0.019
 tuna-0.018
 shrimp-0.018
 fisher-0.017
 Sebastian-0.017
Token the
Token ID262
Feature activation-1.45e-02

Pos logit contributions
 Wally+0.032
 fisher+0.031
 tuna+0.031
 Sebastian+0.030
 shrimp+0.030
Neg logit contributions
appointed-0.029
chair-0.029
 halls-0.027
 ceiling-0.027
 roof-0.027
Token air
Token ID1633
Feature activation-1.71e-02

Pos logit contributions
appointed+0.022
chair+0.021
irds+0.020
unct+0.020
ring+0.019
Neg logit contributions
 Wally-0.026
 tuna-0.026
 shrimp-0.025
 fisher-0.024
 Sebastian-0.024
Token store
Token ID3650
Feature activation+5.00e-03

Pos logit contributions
appointed+0.025
chair+0.024
irds+0.023
unct+0.022
ring+0.022
Neg logit contributions
 Wally-0.023
 tuna-0.023
 shrimp-0.022
 Sebastian-0.021
 fisher-0.021
Token to
Token ID284
Feature activation-4.74e-03

Pos logit contributions
 Wally+0.007
 tuna+0.007
 shrimp+0.007
 fisher+0.006
 Sebastian+0.006
Neg logit contributions
appointed-0.009
chair-0.009
 halls-0.008
unct-0.008
irds-0.008
Token buy
Token ID2822
Feature activation-3.10e-03

Pos logit contributions
appointed+0.008
chair+0.007
unct+0.007
irds+0.007
 halls+0.007
Neg logit contributions
 Wally-0.008
 tuna-0.008
 shrimp-0.007
 Sebastian-0.007
 fisher-0.007
Token some
Token ID617
Feature activation+2.48e-02

Pos logit contributions
appointed+0.005
chair+0.005
unct+0.005
irds+0.005
 halls+0.004
Neg logit contributions
 Wally-0.005
 tuna-0.005
 Sebastian-0.005
 shrimp-0.005
 fisher-0.005
Token soap
Token ID19533
Feature activation+5.26e-02

Pos logit contributions
 Wally+0.047
 Sebastian+0.046
 fisher+0.044
 Crab+0.043
 fossil+0.042
Neg logit contributions
chair-0.034
appointed-0.030
up-0.029
 halls-0.028
 ceiling-0.028
Token.
Token ID13
Feature activation-7.53e-02

Pos logit contributions
 Wally+0.090
 fisher+0.089
 fossil+0.087
 Sebastian+0.086
 fisherman+0.082
Neg logit contributions
chair-0.077
appointed-0.068
ring-0.067
bows-0.066
 ceiling-0.066
Token She
Token ID1375
Feature activation+6.25e-03

Pos logit contributions
unct+0.163
uch+0.152
ics+0.148
vation+0.146
ong+0.144
Neg logit contributions
 Wally-0.077
 tuna-0.069
 like-0.069
 named-0.068
 and-0.066
Token bought
Token ID5839
Feature activation-9.94e-03

Pos logit contributions
 Wally+0.010
 tuna+0.009
 Sebastian+0.009
 shrimp+0.009
 fisher+0.009
Neg logit contributions
appointed-0.011
chair-0.011
unct-0.010
 halls-0.010
irds-0.009
Token the
Token ID262
Feature activation+1.79e-02

Pos logit contributions
appointed+0.018
chair+0.018
unct+0.017
irds+0.016
 halls+0.016
Neg logit contributions
 Wally-0.015
 tuna-0.014
 shrimp-0.014
 Sebastian-0.014
 fisher-0.013
Token soap
Token ID19533
Feature activation+4.41e-02

Pos logit contributions
 Wally+0.026
 Sebastian+0.024
 fisher+0.024
 fossil+0.023
 Crab+0.023
Neg logit contributions
appointed-0.032
chair-0.030
 ceiling-0.029
amas-0.028
 halls-0.028
Token and
Token ID290
Feature activation-5.61e-03

Pos logit contributions
 Wally+0.058
 fisher+0.057
 Sebastian+0.055
 fossil+0.054
 tuna+0.051
Neg logit contributions
chair-0.081
appointed-0.076
 ceiling-0.074
ring-0.074
 roof-0.073
Token But
Token ID887
Feature activation-1.17e-02

Pos logit contributions
appointed+0.062
unct+0.061
vation+0.057
rated+0.056
filled+0.055
Neg logit contributions
 Wally-0.097
 tuna-0.093
 shrimp-0.091
 Sebastian-0.087
 named-0.086
Token Lily
Token ID20037
Feature activation+1.61e-02

Pos logit contributions
appointed+0.020
chair+0.019
unct+0.018
irds+0.017
 halls+0.017
Neg logit contributions
 Wally-0.019
 tuna-0.018
 shrimp-0.018
 Sebastian-0.018
 fisher-0.017
Token didn
Token ID1422
Feature activation-1.38e-02

Pos logit contributions
 Wally+0.027
 tuna+0.027
 shrimp+0.026
 Sebastian+0.026
 fisher+0.025
Neg logit contributions
appointed-0.026
chair-0.024
unct-0.023
irds-0.021
ring-0.021
Token't
Token ID470
Feature activation+1.30e-02

Pos logit contributions
appointed+0.018
chair+0.017
unct+0.016
irds+0.015
 halls+0.015
Neg logit contributions
 Wally-0.028
 tuna-0.027
 Sebastian-0.026
 shrimp-0.026
 fisher-0.025
Token listen
Token ID6004
Feature activation-3.63e-03

Pos logit contributions
 tuna+0.024
 Wally+0.024
 Sebastian+0.023
 shrimp+0.023
 fossil+0.022
Neg logit contributions
appointed-0.018
chair-0.017
ring-0.016
irds-0.016
unct-0.016
Token.
Token ID13
Feature activation-8.46e-02

Pos logit contributions
appointed+0.008
chair+0.007
unct+0.007
irds+0.007
 halls+0.007
Neg logit contributions
 Wally-0.004
 tuna-0.004
 shrimp-0.004
 Sebastian-0.004
 fisher-0.004
Token She
Token ID1375
Feature activation+1.68e-02

Pos logit contributions
vation+0.183
unct+0.179
uch+0.168
filled+0.164
z+0.160
Neg logit contributions
 Wally-0.101
 tuna-0.091
 Fin-0.086
 shrimp-0.085
 Pete-0.083
Token wanted
Token ID2227
Feature activation+1.95e-02

Pos logit contributions
 Wally+0.027
 tuna+0.026
 shrimp+0.025
 Sebastian+0.025
 fisher+0.025
Neg logit contributions
appointed-0.028
chair-0.027
unct-0.024
 halls-0.024
irds-0.024
Token to
Token ID284
Feature activation+1.30e-02

Pos logit contributions
 Wally+0.036
 fisher+0.035
 Sebastian+0.034
 tuna+0.034
 fossil+0.034
Neg logit contributions
chair-0.024
appointed-0.024
 halls-0.022
clock-0.022
filled-0.021
Token talk
Token ID1561
Feature activation-6.31e-03

Pos logit contributions
 Wally+0.023
 tuna+0.023
 shrimp+0.022
 Sebastian+0.022
 fisher+0.022
Neg logit contributions
appointed-0.019
chair-0.017
 roof-0.016
unct-0.016
irds-0.016
Token to
Token ID284
Feature activation-1.23e-02

Pos logit contributions
appointed+0.011
chair+0.010
unct+0.010
irds+0.009
 halls+0.009
Neg logit contributions
 Wally-0.010
 tuna-0.010
 shrimp-0.009
 Sebastian-0.009
 fisher-0.009
Token laughed
Token ID13818
Feature activation+4.76e-04

Pos logit contributions
 Wally+0.019
 tuna+0.018
 Sebastian+0.018
 fisher+0.018
 shrimp+0.018
Neg logit contributions
chair-0.017
appointed-0.017
unct-0.015
 halls-0.015
irds-0.014
Token and
Token ID290
Feature activation-7.21e-03

Pos logit contributions
 Wally+0.000
 tuna+0.000
 shrimp+0.000
 Sebastian+0.000
 fisher+0.000
Neg logit contributions
appointed-0.001
chair-0.001
unct-0.001
irds-0.001
ring-0.001
Token p
Token ID279
Feature activation-2.83e-02

Pos logit contributions
appointed+0.013
chair+0.013
unct+0.012
irds+0.012
 halls+0.012
Neg logit contributions
 Wally-0.011
 tuna-0.010
 shrimp-0.010
 Sebastian-0.010
 fisher-0.010
Tokenatted
Token ID16898
Feature activation-2.64e-02

Pos logit contributions
appointed+0.047
chair+0.046
 ceiling+0.044
 halls+0.042
 roof+0.042
Neg logit contributions
 Wally-0.044
 Sebastian-0.042
 shrimp-0.041
 tuna-0.041
 Finn-0.040
Token the
Token ID262
Feature activation-6.76e-02

Pos logit contributions
appointed+0.056
chair+0.053
irds+0.053
unct+0.052
 halls+0.052
Neg logit contributions
 Wally-0.032
 tuna-0.029
 Sebastian-0.028
 shrimp-0.027
 Finn-0.026
Token pony
Token ID26902
Feature activation-7.79e-02

Pos logit contributions
irds+0.134
unct+0.128
clock+0.126
owed+0.125
azes+0.124
Neg logit contributions
 Wally-0.085
 fisherman-0.083
 tuna-0.080
 shrimp-0.078
 fish-0.075
Token's
Token ID338
Feature activation-6.19e-02

Pos logit contributions
appointed+0.186
irds+0.183
amas+0.179
 halls+0.177
EEP+0.177
Neg logit contributions
 tuna-0.056
 Wally-0.055
 like-0.051
 named-0.051
 and-0.048
Token back
Token ID736
Feature activation-4.60e-02

Pos logit contributions
cases+0.112
appointed+0.106
chair+0.106
flash+0.105
irds+0.104
Neg logit contributions
 tuna-0.083
 Wally-0.083
 shrimp-0.077
 fish-0.076
 fisherman-0.075
Token.
Token ID13
Feature activation-3.77e-02

Pos logit contributions
appointed+0.108
chair+0.103
unct+0.102
irds+0.101
 halls+0.101
Neg logit contributions
 Wally-0.036
 tuna-0.034
 shrimp-0.032
 fisher-0.029
 Sebastian-0.027
Token They
Token ID1119
Feature activation-1.81e-02

Pos logit contributions
unct+0.077
iece+0.073
irds+0.073
appointed+0.072
ring+0.070
Neg logit contributions
 Wally-0.050
 tuna-0.044
 Sebastian-0.042
 Otto-0.042
 Finn-0.041
Token became
Token ID2627
Feature activation+1.81e-02

Pos logit contributions
appointed+0.033
chair+0.032
amas+0.030
irds+0.030
unct+0.029
Neg logit contributions
 Wally-0.027
 tuna-0.025
 shrimp-0.025
 Sebastian-0.024
 fisher-0.024
Token,
Token ID11
Feature activation-6.88e-02

Pos logit contributions
appointed+0.095
amas+0.092
unct+0.090
rolling+0.087
chair+0.087
Neg logit contributions
 Wally-0.025
 tuna-0.023
 shrimp-0.020
 named-0.019
 Finn-0.019
Token after
Token ID706
Feature activation-3.78e-02

Pos logit contributions
appointed+0.146
unct+0.142
rolling+0.136
rell+0.134
amas+0.133
Neg logit contributions
 Wally-0.078
 like-0.075
 tuna-0.074
 named-0.072
 who-0.072
Token sailing
Token ID28499
Feature activation-1.70e-02

Pos logit contributions
appointed+0.078
chair+0.073
amas+0.070
ring+0.069
unct+0.069
Neg logit contributions
 tuna-0.051
 Wally-0.049
 shrimp-0.049
 Sebastian-0.046
 Fin-0.044
Token far
Token ID1290
Feature activation+2.06e-02

Pos logit contributions
appointed+0.036
chair+0.035
unct+0.033
irds+0.033
amas+0.032
Neg logit contributions
 Wally-0.020
 tuna-0.019
 shrimp-0.018
 Sebastian-0.018
 fisher-0.017
Token away
Token ID1497
Feature activation-5.08e-02

Pos logit contributions
 tuna+0.031
 Wally+0.030
 fisher+0.030
 shrimp+0.029
 Sebastian+0.029
Neg logit contributions
appointed-0.036
chair-0.036
unct-0.032
 halls-0.031
ring-0.030
Token and
Token ID290
Feature activation-8.96e-02

Pos logit contributions
appointed+0.132
unct+0.125
flash+0.122
amas+0.122
irds+0.122
Neg logit contributions
 Wally-0.034
 shrimp-0.031
 tuna-0.030
 like-0.025
 Sebastian-0.024
Token getting
Token ID1972
Feature activation-2.94e-03

Pos logit contributions
amas+0.188
irds+0.185
rolling+0.178
appointed+0.169
ugs+0.167
Neg logit contributions
 like-0.101
 Wally-0.092
 also-0.084
 his-0.082
 tuna-0.079
Token lost
Token ID2626
Feature activation+1.69e-02

Pos logit contributions
appointed+0.005
chair+0.004
unct+0.004
irds+0.004
 halls+0.004
Neg logit contributions
 Wally-0.005
 tuna-0.005
 shrimp-0.005
 Sebastian-0.005
 fisher-0.005
Token,
Token ID11
Feature activation-5.70e-02

Pos logit contributions
 Wally+0.027
 fisher+0.026
 tuna+0.026
 Sebastian+0.026
 shrimp+0.025
Neg logit contributions
appointed-0.027
chair-0.027
 halls-0.024
unct-0.024
 ceiling-0.024
Token it
Token ID340
Feature activation-2.40e-02

Pos logit contributions
appointed+0.106
unct+0.105
irds+0.101
ring+0.100
rolling+0.098
Neg logit contributions
 Wally-0.080
 tuna-0.075
 shrimp-0.071
 like-0.069
 named-0.066
Token started
Token ID2067
Feature activation-2.16e-02

Pos logit contributions
appointed+0.039
chair+0.038
irds+0.037
unct+0.035
bows+0.035
Neg logit contributions
 Wally-0.039
 tuna-0.038
 shrimp-0.036
 Finn-0.035
 Sebastian-0.035