TOP ACTIVATIONS
MAX = 5.530

the sky fair ies spinning their magical club ."
ies spinning their magical club ." Sarah
in a world of make - bel ieve .
. It will make your ear better . Do you want
horn and made v room - v room noises .
The bright colors made Ab ig ail so happy .
sun would make the sc ooter too hot to touch .
light club spinning in the sky !"
day driving and making the world pretty . And from that
shining brightly , making the world look so beautiful . She
his red cape and make - bel ieve he had super
the little girl and her arm chair were the best of
when he was playing make - bel ieve at home ,
a light club spinning in the sky
way . This made the dress look different , with a
blanket before it was torn S ue decided
from home and eventually stumbled upon a weird stage . On
song and make the toy chain dance in the air .
louder . Suddenly , the ringing stopped and a funny voice
I see a light club spinning in

MIN ACTIVATIONS
MIN = -4.237

is shrimp . It is a small fish that lives in
This is shrimp . It is a small fish that lives
juice . Sam wanted a pizza and a milk . Mom
They had a lot of choices . They were not bored
. He thinks it is a cool gun .
lunch . One had a tuna sandwich and the other an
. This shrimp is not a food . It is a
wanted to look for a sport he could do . Then
" A mineral is a kind of rock that comes
These are peanuts . They are interesting because they grow underground
basket , Jack found delicious sandwiches , apples , and oranges
regular pizza . A regular pizza has cheese and sauce and
and look for bugs . Ben finds a big worm and
, " It 's a tuna , and we can catch
to play with toys . Ben has a new toy .
ily offered Tim my some food , but he said no
It has a lot of cheese and sauce and bread .
, so he looked for food . He found a shell
Tim wanted to find some food , so he hopped around
. He found some plastic containers . He thought they looked

INTERVAL 3.088 - 5.530
CONTAINS 0.134%

, she would sing a happy tune . One day she
They played hide - and - seek and jumped in p
so sad but then she heard a lovely sound . It
a song coming from outside . It was a beautiful ,
you frightened ? ” Jane said ,

INTERVAL 0.646 - 3.088
CONTAINS 23.730%

looked so pretty and felt so confident in her black dress
said it was time for bed . Lily got ready for
mom warned her not to touch the shelf , but she
Tim my 's room and take the key .
lived in a big castle . He was very tall and

INTERVAL -1.795 - 0.646
CONTAINS 74.179%

with princess es and fair ies . One day
Lily say . " Yes , I have .
starting to get really frustrated . He started to wonder if
Once upon a time , there lived a mighty dragon .
new flag . Every day she made sure to

INTERVAL -4.237 - -1.795
CONTAINS 1.958%

is very old and has a stitch , but he is
, but her mom my was persistent and made sure she
friend too . You always share with me ."
needed a special item to finish their collection . L ola
to eat . She thinks tuna is the best food in
Token the
Token ID262
Feature activation+1.38e-01

Pos logit contributions
appointed+0.016
chair+0.016
unct+0.014
irds+0.014
 halls+0.014
Neg logit contributions
 Wally-0.012
 tuna-0.012
 Sebastian-0.012
 shrimp-0.012
 fisher-0.011
Token sky
Token ID6766
Feature activation+2.23e+00

Pos logit contributions
appointed+0.005
chair+0.005
irds+0.004
unct+0.004
 halls+0.004
Neg logit contributions
 Wally-0.006
 tuna-0.006
 Sebastian-0.006
 shrimp-0.006
 fisher-0.006
Token fair
Token ID3148
Feature activation+1.54e+00

Pos logit contributions
appointed+0.117
chair+0.110
unct+0.109
irds+0.106
 halls+0.105
Neg logit contributions
 tuna-0.059
 Wally-0.058
 shrimp-0.056
 Sebastian-0.052
 like-0.049
Tokenies
Token ID444
Feature activation+5.84e-01

Pos logit contributions
appointed+0.087
chair+0.081
unct+0.080
flash+0.078
amas+0.076
Neg logit contributions
 tuna-0.037
 Wally-0.037
 shrimp-0.037
 Sebastian-0.033
 fisher-0.033
Token spinning
Token ID19493
Feature activation+8.77e-01

Pos logit contributions
appointed+0.029
chair+0.027
unct+0.026
irds+0.026
 halls+0.025
Neg logit contributions
 Wally-0.019
 tuna-0.018
 shrimp-0.017
 Sebastian-0.017
 fisher-0.016
Token
Token ID220
Feature activation+5.53e+00

Pos logit contributions
appointed+0.042
chair+0.040
unct+0.038
irds+0.037
 halls+0.036
Neg logit contributions
 Wally-0.029
 tuna-0.027
 shrimp-0.027
 Sebastian-0.026
 fisher-0.025
Token their
Token ID511
Feature activation+1.52e+00

Pos logit contributions
unct+0.216
appointed+0.193
 floors+0.192
 forehead+0.192
 halls+0.191
Neg logit contributions
 tuna-0.171
 Wally-0.171
inn-0.168
 Finn-0.165
 Sebastian-0.158
Token magical
Token ID10883
Feature activation+5.65e-01

Pos logit contributions
appointed+0.061
chair+0.058
unct+0.053
irds+0.050
rays+0.050
Neg logit contributions
 shrimp-0.062
 Wally-0.062
 tuna-0.061
 Sebastian-0.056
 fisher-0.056
Token
Token ID220
Feature activation+5.43e+00

Pos logit contributions
appointed+0.020
chair+0.019
unct+0.018
irds+0.017
flash+0.017
Neg logit contributions
 Wally-0.025
 tuna-0.025
 shrimp-0.024
 Sebastian-0.024
 fisher-0.023
Token club
Token ID3430
Feature activation+2.42e+00

Pos logit contributions
unct+0.206
chair+0.186
 halls+0.185
appointed+0.182
 floors+0.179
Neg logit contributions
inn-0.170
 Wally-0.170
 Finn-0.166
 tuna-0.166
 Sebastian-0.164
Token."
Token ID22135
Feature activation+1.50e+00

Pos logit contributions
appointed+0.127
unct+0.119
chair+0.114
irds+0.113
flash+0.113
Neg logit contributions
 Wally-0.064
 tuna-0.061
 shrimp-0.058
 fisher-0.054
 named-0.052
Tokenies
Token ID444
Feature activation+5.84e-01

Pos logit contributions
appointed+0.087
chair+0.081
unct+0.080
flash+0.078
amas+0.076
Neg logit contributions
 tuna-0.037
 Wally-0.037
 shrimp-0.037
 Sebastian-0.033
 fisher-0.033
Token spinning
Token ID19493
Feature activation+8.77e-01

Pos logit contributions
appointed+0.029
chair+0.027
unct+0.026
irds+0.026
 halls+0.025
Neg logit contributions
 Wally-0.019
 tuna-0.018
 shrimp-0.017
 Sebastian-0.017
 fisher-0.016
Token
Token ID220
Feature activation+5.53e+00

Pos logit contributions
appointed+0.042
chair+0.040
unct+0.038
irds+0.037
 halls+0.036
Neg logit contributions
 Wally-0.029
 tuna-0.027
 shrimp-0.027
 Sebastian-0.026
 fisher-0.025
Token their
Token ID511
Feature activation+1.52e+00

Pos logit contributions
unct+0.216
appointed+0.193
 floors+0.192
 forehead+0.192
 halls+0.191
Neg logit contributions
 tuna-0.171
 Wally-0.171
inn-0.168
 Finn-0.165
 Sebastian-0.158
Token magical
Token ID10883
Feature activation+5.65e-01

Pos logit contributions
appointed+0.061
chair+0.058
unct+0.053
irds+0.050
rays+0.050
Neg logit contributions
 shrimp-0.062
 Wally-0.062
 tuna-0.061
 Sebastian-0.056
 fisher-0.056
Token
Token ID220
Feature activation+5.43e+00

Pos logit contributions
appointed+0.020
chair+0.019
unct+0.018
irds+0.017
flash+0.017
Neg logit contributions
 Wally-0.025
 tuna-0.025
 shrimp-0.024
 Sebastian-0.024
 fisher-0.023
Token club
Token ID3430
Feature activation+2.42e+00

Pos logit contributions
unct+0.206
chair+0.186
 halls+0.185
appointed+0.182
 floors+0.179
Neg logit contributions
inn-0.170
 Wally-0.170
 Finn-0.166
 tuna-0.166
 Sebastian-0.164
Token."
Token ID22135
Feature activation+1.50e+00

Pos logit contributions
appointed+0.127
unct+0.119
chair+0.114
irds+0.113
flash+0.113
Neg logit contributions
 Wally-0.064
 tuna-0.061
 shrimp-0.058
 fisher-0.054
 named-0.052
Token
Token ID198
Feature activation+9.25e-01

Pos logit contributions
appointed+0.075
unct+0.073
chair+0.071
 halls+0.070
irds+0.069
Neg logit contributions
 Wally-0.044
 tuna-0.040
 Sebastian-0.039
 named-0.035
 shrimp-0.035
Token
Token ID198
Feature activation+9.55e-01

Pos logit contributions
appointed+0.053
chair+0.051
unct+0.050
ring+0.049
irds+0.048
Neg logit contributions
 Wally-0.021
 tuna-0.019
 Sebastian-0.019
 shrimp-0.018
 Finn-0.017
TokenSarah
Token ID29284
Feature activation+9.95e-01

Pos logit contributions
appointed+0.039
chair+0.038
unct+0.038
ring+0.036
irds+0.036
Neg logit contributions
 Wally-0.035
 tuna-0.032
 Sebastian-0.032
 shrimp-0.031
 fisher-0.030
Token in
Token ID287
Feature activation-2.84e-01

Pos logit contributions
appointed+0.121
irds+0.110
unct+0.108
chair+0.108
EEP+0.107
Neg logit contributions
 tuna-0.058
 Wally-0.057
 shrimp-0.053
 fossil-0.048
 Sebastian-0.047
Token a
Token ID257
Feature activation+5.46e-01

Pos logit contributions
 Wally+0.011
 Sebastian+0.011
 fisher+0.011
 tuna+0.011
 shrimp+0.011
Neg logit contributions
appointed-0.011
chair-0.011
 halls-0.010
unct-0.010
clock-0.010
Token world
Token ID995
Feature activation+1.27e+00

Pos logit contributions
appointed+0.022
chair+0.020
irds+0.019
unct+0.019
ring+0.018
Neg logit contributions
 Wally-0.023
 tuna-0.023
 shrimp-0.022
 Sebastian-0.021
 fisher-0.021
Token of
Token ID286
Feature activation+1.54e-01

Pos logit contributions
appointed+0.069
unct+0.064
irds+0.064
amas+0.061
flash+0.060
Neg logit contributions
 Wally-0.034
 tuna-0.033
 shrimp-0.031
 Sebastian-0.029
 fisher-0.027
Token make
Token ID787
Feature activation+3.69e+00

Pos logit contributions
appointed+0.006
chair+0.005
unct+0.005
irds+0.005
 halls+0.005
Neg logit contributions
 Wally-0.007
 tuna-0.006
 Sebastian-0.006
 shrimp-0.006
 fisher-0.006
Token-
Token ID12
Feature activation+5.32e+00

Pos logit contributions
appointed+0.208
EEP+0.182
unct+0.182
bars+0.181
chair+0.179
Neg logit contributions
 tuna-0.111
 Wally-0.100
 shrimp-0.099
 fish-0.092
 his-0.088
Tokenbel
Token ID6667
Feature activation+1.91e+00

Pos logit contributions
amas+0.217
appointed+0.209
oga+0.196
EEP+0.189
 ceiling+0.177
Neg logit contributions
 who-0.197
 like-0.184
 fish-0.179
 shrimp-0.177
 named-0.172
Tokenieve
Token ID12311
Feature activation+9.25e-01

Pos logit contributions
appointed+0.088
chair+0.083
irds+0.083
EEP+0.079
 roof+0.077
Neg logit contributions
 tuna-0.057
 Wally-0.056
 fisher-0.055
 Sebastian-0.052
 shrimp-0.051
Token.
Token ID13
Feature activation+2.38e+00

Pos logit contributions
appointed+0.043
chair+0.039
unct+0.038
irds+0.037
flash+0.036
Neg logit contributions
 Wally-0.032
 tuna-0.032
 shrimp-0.031
 fisher-0.030
 Sebastian-0.029
Token
Token ID198
Feature activation+5.55e-01

Pos logit contributions
unct+0.111
appointed+0.106
arse+0.104
ong+0.101
irds+0.099
Neg logit contributions
 Wally-0.086
 tuna-0.077
 named-0.074
 Nem-0.072
 Pete-0.071
Token
Token ID198
Feature activation+1.69e+00

Pos logit contributions
appointed+0.030
chair+0.028
unct+0.027
irds+0.027
ring+0.027
Neg logit contributions
 Wally-0.015
 tuna-0.014
 Sebastian-0.013
 shrimp-0.013
 fisher-0.012
Token.
Token ID13
Feature activation+1.03e+00

Pos logit contributions
appointed+0.011
chair+0.010
unct+0.010
irds+0.010
 halls+0.009
Neg logit contributions
 Wally-0.008
 tuna-0.008
 shrimp-0.008
 Sebastian-0.008
 fisher-0.008
Token It
Token ID632
Feature activation+1.92e+00

Pos logit contributions
appointed+0.043
chair+0.040
unct+0.040
irds+0.038
cases+0.038
Neg logit contributions
 Wally-0.040
 tuna-0.039
 Sebastian-0.037
 shrimp-0.036
 fisher-0.036
Token will
Token ID481
Feature activation+7.70e-01

Pos logit contributions
appointed+0.097
irds+0.091
iece+0.088
rolling+0.087
unct+0.086
Neg logit contributions
 tuna-0.059
 Wally-0.056
 shrimp-0.049
 Finn-0.048
 Fin-0.048
Token make
Token ID787
Feature activation+2.86e+00

Pos logit contributions
appointed+0.032
chair+0.031
irds+0.029
unct+0.028
 halls+0.027
Neg logit contributions
 Wally-0.030
 tuna-0.029
 Sebastian-0.028
 shrimp-0.028
 fisher-0.027
Token your
Token ID534
Feature activation+2.50e+00

Pos logit contributions
appointed+0.165
irds+0.157
rolling+0.152
bows+0.151
chair+0.149
Neg logit contributions
 tuna-0.080
 Wally-0.067
 his-0.060
 shrimp-0.059
 Finn-0.057
Token ear
Token ID1027
Feature activation+5.24e+00

Pos logit contributions
appointed+0.107
irds+0.099
cakes+0.094
cases+0.093
filled+0.092
Neg logit contributions
 tuna-0.100
 Wally-0.086
 fish-0.085
 shrimp-0.085
 shark-0.084
Token better
Token ID1365
Feature activation+1.27e+00

Pos logit contributions
appointed+0.246
umm+0.235
ol+0.227
 halls+0.224
umbled+0.223
Neg logit contributions
 like-0.156
 who-0.144
 and-0.143
 also-0.130
 or-0.130
Token.
Token ID13
Feature activation+3.60e-01

Pos logit contributions
appointed+0.070
chair+0.064
unct+0.063
irds+0.062
flash+0.062
Neg logit contributions
 tuna-0.033
 Wally-0.031
 shrimp-0.030
 fisher-0.029
 Sebastian-0.028
Token Do
Token ID2141
Feature activation+1.55e+00

Pos logit contributions
appointed+0.014
chair+0.014
unct+0.013
irds+0.012
 halls+0.012
Neg logit contributions
 Wally-0.015
 tuna-0.014
 Sebastian-0.014
 shrimp-0.014
 fisher-0.013
Token you
Token ID345
Feature activation+1.03e-01

Pos logit contributions
appointed+0.078
chair+0.072
ring+0.071
rolling+0.070
amas+0.069
Neg logit contributions
 Wally-0.054
 tuna-0.051
 shrimp-0.047
 Fin-0.045
 Finn-0.044
Token want
Token ID765
Feature activation+7.56e-01

Pos logit contributions
appointed+0.004
chair+0.004
unct+0.004
irds+0.004
 halls+0.004
Neg logit contributions
 Wally-0.004
 tuna-0.004
 Sebastian-0.004
 shrimp-0.004
 fisher-0.004
Token horn
Token ID12718
Feature activation+1.80e+00

Pos logit contributions
appointed+0.101
irds+0.091
ics+0.088
cases+0.087
rolling+0.086
Neg logit contributions
 tuna-0.081
 fish-0.077
 fossil-0.073
 tank-0.073
 shark-0.071
Token and
Token ID290
Feature activation+2.71e-01

Pos logit contributions
appointed+0.125
unct+0.117
irds+0.115
 halls+0.113
rolling+0.112
Neg logit contributions
 tuna-0.020
 Wally-0.018
 fisher-0.014
 shrimp-0.014
 fish-0.013
Token made
Token ID925
Feature activation+3.20e+00

Pos logit contributions
appointed+0.013
chair+0.012
irds+0.011
unct+0.011
 halls+0.011
Neg logit contributions
 Wally-0.009
 tuna-0.009
 Sebastian-0.009
 shrimp-0.009
 fisher-0.008
Token v
Token ID410
Feature activation+2.45e+00

Pos logit contributions
appointed+0.205
unct+0.182
rolling+0.179
amas+0.179
chair+0.177
Neg logit contributions
 tuna-0.069
 his-0.068
 fish-0.067
 Finn-0.061
 shrimp-0.059
Tokenroom
Token ID3823
Feature activation+3.51e+00

Pos logit contributions
appointed+0.118
unct+0.105
chair+0.103
rated+0.096
rolling+0.096
Neg logit contributions
 fisher-0.073
 chefs-0.069
 tuna-0.066
 Finn-0.065
 Sebastian-0.064
Token-
Token ID12
Feature activation+5.19e+00

Pos logit contributions
appointed+0.152
unct+0.127
irds+0.127
oga+0.126
 doorstep+0.124
Neg logit contributions
 tuna-0.135
 fish-0.132
 his-0.123
 and-0.119
 like-0.117
Tokenv
Token ID85
Feature activation+1.50e+00

Pos logit contributions
appointed+0.261
 doorstep+0.250
amas+0.248
oga+0.233
rolling+0.227
Neg logit contributions
 who-0.135
 like-0.128
 fish-0.121
He-0.106
 his-0.103
Tokenroom
Token ID3823
Feature activation+3.53e+00

Pos logit contributions
appointed+0.060
unct+0.054
chair+0.053
irds+0.048
rolling+0.048
Neg logit contributions
 tuna-0.058
 Sebastian-0.057
 fisher-0.057
 Wally-0.057
 Finn-0.056
Token noises
Token ID26782
Feature activation+1.95e-01

Pos logit contributions
appointed+0.147
unct+0.125
irds+0.123
oga+0.120
chair+0.118
Neg logit contributions
 tuna-0.141
 fish-0.137
 his-0.126
 and-0.125
 like-0.123
Token.
Token ID13
Feature activation+6.13e-01

Pos logit contributions
appointed+0.010
chair+0.009
unct+0.009
irds+0.009
 halls+0.008
Neg logit contributions
 Wally-0.006
 tuna-0.006
 Sebastian-0.006
 shrimp-0.006
 fisher-0.005
Token
Token ID198
Feature activation-4.91e-01

Pos logit contributions
appointed+0.030
chair+0.029
unct+0.028
irds+0.027
ring+0.026
Neg logit contributions
 Wally-0.019
 tuna-0.018
 Sebastian-0.018
 shrimp-0.017
 fisher-0.017
Token The
Token ID383
Feature activation+1.97e+00

Pos logit contributions
chair+0.083
unct+0.083
vation+0.081
appointed+0.079
iece+0.079
Neg logit contributions
 Wally-0.075
 Sebastian-0.068
 shrimp-0.068
 Fin-0.068
 Finn-0.068
Token bright
Token ID6016
Feature activation+1.93e+00

Pos logit contributions
cases+0.081
chair+0.080
appointed+0.080
amas+0.074
unct+0.073
Neg logit contributions
 shrimp-0.081
 fish-0.076
 Wally-0.074
 tuna-0.074
 fisherman-0.070
Token colors
Token ID7577
Feature activation+5.56e-01

Pos logit contributions
appointed+0.078
cases+0.076
unct+0.073
seek+0.072
amas+0.072
Neg logit contributions
 shrimp-0.079
 fish-0.075
 tuna-0.073
 Wally-0.071
 named-0.069
Token made
Token ID925
Feature activation+2.28e+00

Pos logit contributions
appointed+0.025
chair+0.023
unct+0.022
irds+0.022
 halls+0.021
Neg logit contributions
 Wally-0.020
 tuna-0.019
 shrimp-0.019
 Sebastian-0.019
 fisher-0.018
Token Ab
Token ID2275
Feature activation+2.88e+00

Pos logit contributions
chair+0.133
appointed+0.127
bars+0.125
cases+0.122
unct+0.121
Neg logit contributions
 Wally-0.058
 shrimp-0.056
 Nem-0.053
 tuna-0.053
 Sebastian-0.053
Tokenig
Token ID328
Feature activation+5.12e+00

Pos logit contributions
appointed+0.150
chair+0.146
cases+0.138
case+0.136
 halls+0.134
Neg logit contributions
 Finn-0.069
 fisher-0.067
 tuna-0.067
 Sebastian-0.065
 named-0.065
Tokenail
Token ID603
Feature activation+4.33e+00

Pos logit contributions
chair+0.227
clock+0.214
appointed+0.209
umm+0.203
flash+0.191
Neg logit contributions
 and-0.167
 sw-0.156
 swimming-0.154
 fish-0.152
 spl-0.152
Token so
Token ID523
Feature activation+2.25e+00

Pos logit contributions
appointed+0.219
amas+0.214
aries+0.213
unct+0.207
EEP+0.205
Neg logit contributions
 like-0.115
 fish-0.106
 and-0.106
 tuna-0.105
 shrimp-0.097
Token happy
Token ID3772
Feature activation+4.48e-01

Pos logit contributions
amas+0.139
aries+0.136
rays+0.135
appointed+0.133
oga+0.132
Neg logit contributions
 Wally-0.045
 Finn-0.042
 shrimp-0.041
 fish-0.039
 tuna-0.038
Token.
Token ID13
Feature activation+1.57e+00

Pos logit contributions
appointed+0.023
chair+0.022
unct+0.021
irds+0.021
flash+0.020
Neg logit contributions
 Wally-0.013
 tuna-0.013
 shrimp-0.012
 Sebastian-0.012
 fisher-0.012
Token
Token ID198
Feature activation-8.91e-02

Pos logit contributions
chair+0.062
unct+0.060
appointed+0.060
cases+0.056
vation+0.056
Neg logit contributions
 Wally-0.064
 Sebastian-0.060
 Finn-0.058
 Fin-0.058
 shrimp-0.058
Token sun
Token ID4252
Feature activation+1.73e+00

Pos logit contributions
irds+0.064
appointed+0.062
owed+0.060
chair+0.059
cases+0.058
Neg logit contributions
 Sebastian-0.059
 tuna-0.058
 Wally-0.058
 shrimp-0.057
 fisher-0.055
Token would
Token ID561
Feature activation+8.38e-01

Pos logit contributions
appointed+0.092
unct+0.087
irds+0.086
 halls+0.082
chair+0.082
Neg logit contributions
 Wally-0.041
 tuna-0.040
 Sebastian-0.038
 named-0.037
 shrimp-0.037
Token make
Token ID787
Feature activation+4.24e+00

Pos logit contributions
appointed+0.030
chair+0.029
irds+0.028
unct+0.026
 halls+0.026
Neg logit contributions
 Wally-0.036
 tuna-0.035
 Sebastian-0.034
 shrimp-0.034
 fisher-0.032
Token the
Token ID262
Feature activation+1.98e+00

Pos logit contributions
appointed+0.287
irds+0.283
seek+0.274
cases+0.272
rolling+0.268
Neg logit contributions
 them-0.075
 their-0.067
 his-0.062
 it-0.050
 tuna-0.041
Token sc
Token ID629
Feature activation+3.72e+00

Pos logit contributions
appointed+0.079
irds+0.079
cases+0.075
chair+0.075
owed+0.074
Neg logit contributions
 shrimp-0.074
 Sebastian-0.073
 tuna-0.073
 fish-0.072
 Wally-0.069
Tokenooter
Token ID25141
Feature activation+5.10e+00

Pos logit contributions
 ceiling+0.120
appointed+0.119
amas+0.109
 endings+0.108
light+0.107
Neg logit contributions
 named-0.152
 fisher-0.139
 Sebastian-0.138
 fishing-0.134
 Fin-0.132
Token too
Token ID1165
Feature activation+1.67e+00

Pos logit contributions
irds+0.257
appointed+0.247
olk+0.245
umm+0.237
amas+0.234
Neg logit contributions
 like-0.151
 his-0.125
 and-0.122
 or-0.109
 too-0.107
Token hot
Token ID3024
Feature activation-3.76e-01

Pos logit contributions
appointed+0.085
irds+0.084
iece+0.082
chair+0.082
unct+0.082
Neg logit contributions
 tuna-0.043
 Wally-0.042
 shrimp-0.041
 Sebastian-0.039
 fisher-0.037
Token to
Token ID284
Feature activation+1.50e+00

Pos logit contributions
 Wally+0.014
 tuna+0.014
 Sebastian+0.014
 fisher+0.013
 shrimp+0.013
Neg logit contributions
appointed-0.016
chair-0.016
ring-0.014
unct-0.014
clock-0.013
Token touch
Token ID3638
Feature activation+1.74e+00

Pos logit contributions
appointed+0.060
chair+0.056
irds+0.055
cases+0.053
clock+0.052
Neg logit contributions
 Wally-0.059
 Sebastian-0.056
 tuna-0.056
 shrimp-0.054
 Otto-0.053
Token.
Token ID13
Feature activation+1.54e+00

Pos logit contributions
appointed+0.101
irds+0.094
unct+0.093
rolling+0.092
chair+0.091
Neg logit contributions
 Wally-0.042
 tuna-0.040
 shrimp-0.037
 Sebastian-0.036
 Finn-0.033
Token light
Token ID1657
Feature activation+7.06e-01

Pos logit contributions
unct+0.177
appointed+0.174
 cheeks+0.172
erved+0.165
 forehead+0.163
Neg logit contributions
 Wally-0.144
 Sebastian-0.142
 tuna-0.140
 shrimp-0.139
 fisher-0.136
Token
Token ID220
Feature activation+4.83e+00

Pos logit contributions
appointed+0.030
chair+0.027
unct+0.026
irds+0.026
 halls+0.025
Neg logit contributions
 Wally-0.027
 tuna-0.027
 shrimp-0.026
 Sebastian-0.025
 fisher-0.025
Token club
Token ID3430
Feature activation+6.41e-01

Pos logit contributions
unct+0.180
appointed+0.177
 cheeks+0.175
erved+0.170
 halls+0.167
Neg logit contributions
 Wally-0.147
 tuna-0.147
 Sebastian-0.144
inn-0.143
 shrimp-0.141
Token
Token ID220
Feature activation+4.95e+00

Pos logit contributions
appointed+0.031
chair+0.029
unct+0.028
irds+0.028
 halls+0.027
Neg logit contributions
 Wally-0.021
 tuna-0.020
 shrimp-0.019
 Sebastian-0.019
 fisher-0.019
Token spinning
Token ID19493
Feature activation+7.59e-01

Pos logit contributions
unct+0.185
 cheeks+0.180
appointed+0.179
 roof+0.174
 halls+0.174
Neg logit contributions
 Wally-0.153
inn-0.153
 tuna-0.151
 Sebastian-0.147
 Finn-0.144
Token
Token ID220
Feature activation+5.08e+00

Pos logit contributions
appointed+0.036
chair+0.034
unct+0.033
irds+0.032
 halls+0.031
Neg logit contributions
 Wally-0.025
 tuna-0.024
 shrimp-0.023
 Sebastian-0.023
 fisher-0.022
Token in
Token ID287
Feature activation+1.40e-01

Pos logit contributions
unct+0.193
appointed+0.184
 cheeks+0.181
 halls+0.181
 roof+0.180
Neg logit contributions
 Wally-0.159
 tuna-0.157
inn-0.157
 Sebastian-0.149
 Finn-0.149
Token the
Token ID262
Feature activation+1.56e+00

Pos logit contributions
appointed+0.007
chair+0.006
unct+0.006
irds+0.006
 halls+0.006
Neg logit contributions
 Wally-0.005
 tuna-0.005
 Sebastian-0.004
 shrimp-0.004
 fisher-0.004
Token sky
Token ID6766
Feature activation+1.87e+00

Pos logit contributions
appointed+0.051
irds+0.048
unct+0.047
chair+0.045
cases+0.044
Neg logit contributions
 tuna-0.076
 Wally-0.075
 shrimp-0.075
 Sebastian-0.068
 named-0.068
Token!"
Token ID2474
Feature activation+2.14e+00

Pos logit contributions
appointed+0.099
chair+0.096
unct+0.094
irds+0.090
 halls+0.090
Neg logit contributions
 Wally-0.048
 tuna-0.047
 shrimp-0.045
 Sebastian-0.042
 fisher-0.042
Token
Token ID198
Feature activation+9.99e-01

Pos logit contributions
unct+0.114
appointed+0.109
chair+0.106
flash+0.105
irds+0.105
Neg logit contributions
 named-0.051
 Wally-0.049
 Sebastian-0.047
 tuna-0.040
 shrimp-0.039
Token day
Token ID1110
Feature activation+1.04e+00

Pos logit contributions
 Wally+0.024
 Sebastian+0.024
 tuna+0.022
 shrimp+0.022
 fisher+0.021
Neg logit contributions
appointed-0.028
chair-0.028
 halls-0.026
bows-0.025
clock-0.025
Token driving
Token ID5059
Feature activation+1.53e+00

Pos logit contributions
appointed+0.055
chair+0.051
irds+0.049
unct+0.049
flash+0.048
Neg logit contributions
 Wally-0.031
 tuna-0.029
 shrimp-0.028
 Sebastian-0.028
 fisher-0.026
Token and
Token ID290
Feature activation+1.73e+00

Pos logit contributions
appointed+0.074
unct+0.068
chair+0.067
irds+0.066
flash+0.065
Neg logit contributions
 Wally-0.050
 tuna-0.046
 Sebastian-0.045
 shrimp-0.044
 Finn-0.042
Token making
Token ID1642
Feature activation+2.72e+00

Pos logit contributions
appointed+0.085
irds+0.079
chair+0.079
amas+0.076
flash+0.074
Neg logit contributions
 Wally-0.055
 tuna-0.052
 Sebastian-0.050
 shrimp-0.048
 Finn-0.047
Token the
Token ID262
Feature activation+1.83e+00

Pos logit contributions
appointed+0.145
irds+0.130
chair+0.130
unct+0.126
ring+0.126
Neg logit contributions
 tuna-0.085
 Wally-0.083
 shrimp-0.078
 Sebastian-0.076
 Finn-0.074
Token world
Token ID995
Feature activation+5.05e+00

Pos logit contributions
appointed+0.072
chair+0.065
irds+0.063
ring+0.061
unct+0.061
Neg logit contributions
 tuna-0.078
 Wally-0.076
 shrimp-0.076
 Sebastian-0.072
 fisher-0.072
Token pretty
Token ID2495
Feature activation+1.78e+00

Pos logit contributions
unct+0.257
appointed+0.248
irds+0.240
amas+0.237
olk+0.234
Neg logit contributions
 like-0.151
 and-0.125
 their-0.112
 his-0.109
 tuna-0.107
Token.
Token ID13
Feature activation+5.94e-01

Pos logit contributions
appointed+0.098
unct+0.088
irds+0.085
rolling+0.084
chair+0.083
Neg logit contributions
 tuna-0.052
 shrimp-0.051
 Wally-0.047
 fish-0.045
 Sebastian-0.044
Token And
Token ID843
Feature activation-3.08e-01

Pos logit contributions
appointed+0.026
chair+0.024
unct+0.023
irds+0.023
 halls+0.022
Neg logit contributions
 Wally-0.022
 tuna-0.021
 Sebastian-0.021
 shrimp-0.021
 fisher-0.020
Token from
Token ID422
Feature activation-5.99e-01

Pos logit contributions
 Wally+0.011
 tuna+0.010
 shrimp+0.010
 Sebastian+0.010
 fisher+0.010
Neg logit contributions
appointed-0.014
chair-0.013
irds-0.012
unct-0.012
 halls-0.012
Token that
Token ID326
Feature activation+4.90e-01

Pos logit contributions
 Wally+0.019
 tuna+0.018
 fisher+0.018
 Sebastian+0.018
 shrimp+0.018
Neg logit contributions
appointed-0.028
chair-0.028
unct-0.026
 halls-0.025
filled-0.024
Token shining
Token ID22751
Feature activation+1.96e+00

Pos logit contributions
appointed+0.056
chair+0.056
irds+0.055
unct+0.053
cases+0.050
Neg logit contributions
 tuna-0.045
 Wally-0.045
 shrimp-0.044
 Sebastian-0.043
 named-0.039
Token brightly
Token ID35254
Feature activation+6.93e-01

Pos logit contributions
appointed+0.119
irds+0.118
unct+0.116
chair+0.113
seek+0.110
Neg logit contributions
 tuna-0.035
 shrimp-0.032
 Wally-0.032
 Sebastian-0.029
 like-0.027
Token,
Token ID11
Feature activation+8.43e-01

Pos logit contributions
appointed+0.038
chair+0.037
irds+0.035
unct+0.035
 halls+0.034
Neg logit contributions
 Wally-0.017
 tuna-0.017
 shrimp-0.016
 Sebastian-0.016
 fisher-0.015
Token making
Token ID1642
Feature activation+3.89e+00

Pos logit contributions
appointed+0.051
chair+0.048
irds+0.048
unct+0.047
flash+0.045
Neg logit contributions
 Wally-0.017
 tuna-0.016
 shrimp-0.015
 Sebastian-0.015
 fisher-0.013
Token the
Token ID262
Feature activation+2.75e+00

Pos logit contributions
appointed+0.249
unct+0.238
ooter+0.233
irds+0.233
bars+0.232
Neg logit contributions
 tuna-0.078
 shrimp-0.069
 their-0.068
 them-0.065
 Wally-0.056
Token world
Token ID995
Feature activation+5.01e+00

Pos logit contributions
appointed+0.115
irds+0.112
ending+0.102
unct+0.100
ring+0.100
Neg logit contributions
 tuna-0.109
 shrimp-0.106
 fish-0.100
 Wally-0.096
 shark-0.094
Token look
Token ID804
Feature activation+1.10e+00

Pos logit contributions
unct+0.280
appointed+0.269
irds+0.254
amas+0.248
urious+0.248
Neg logit contributions
 like-0.145
 and-0.125
 tuna-0.103
 their-0.103
 fish-0.097
Token so
Token ID523
Feature activation+3.12e+00

Pos logit contributions
appointed+0.048
chair+0.047
irds+0.045
unct+0.044
oga+0.042
Neg logit contributions
 tuna-0.039
 shrimp-0.039
 Wally-0.038
 Sebastian-0.037
 fisher-0.034
Token beautiful
Token ID4950
Feature activation+2.34e-01

Pos logit contributions
amas+0.175
age+0.175
oga+0.173
seek+0.172
iece+0.172
Neg logit contributions
 shrimp-0.084
 Nem-0.072
 tuna-0.070
 inviting-0.069
 fish-0.064
Token.
Token ID13
Feature activation+2.98e+00

Pos logit contributions
appointed+0.011
chair+0.011
unct+0.010
irds+0.010
 halls+0.010
Neg logit contributions
 Wally-0.008
 tuna-0.007
 shrimp-0.007
 Sebastian-0.007
 fisher-0.007
Token She
Token ID1375
Feature activation+1.25e+00

Pos logit contributions
unct+0.159
vation+0.144
irds+0.142
ring+0.142
ics+0.138
Neg logit contributions
 named-0.074
 Wally-0.071
 Sebastian-0.070
 but-0.063
 Tommy-0.061
Token his
Token ID465
Feature activation+4.44e-01

Pos logit contributions
appointed+0.016
chair+0.015
unct+0.014
irds+0.014
 halls+0.014
Neg logit contributions
 Wally-0.006
 tuna-0.006
 shrimp-0.006
 Sebastian-0.006
 fisher-0.005
Token red
Token ID2266
Feature activation+6.30e-01

Pos logit contributions
appointed+0.017
chair+0.016
irds+0.015
unct+0.015
 halls+0.014
Neg logit contributions
 Wally-0.019
 tuna-0.018
 shrimp-0.018
 Sebastian-0.018
 fisher-0.017
Token cape
Token ID36023
Feature activation+6.79e-01

Pos logit contributions
appointed+0.027
chair+0.025
unct+0.023
irds+0.023
flash+0.022
Neg logit contributions
 tuna-0.024
 Wally-0.024
 shrimp-0.023
 Sebastian-0.023
 fisher-0.022
Token and
Token ID290
Feature activation+1.02e+00

Pos logit contributions
appointed+0.037
chair+0.035
unct+0.034
irds+0.034
 halls+0.033
Neg logit contributions
 Wally-0.017
 tuna-0.017
 Sebastian-0.016
 shrimp-0.016
 fisher-0.015
Token make
Token ID787
Feature activation+2.68e+00

Pos logit contributions
appointed+0.051
chair+0.047
irds+0.045
unct+0.044
amas+0.044
Neg logit contributions
 tuna-0.032
 Wally-0.032
 shrimp-0.030
 Sebastian-0.030
 fisher-0.029
Token-
Token ID12
Feature activation+5.01e+00

Pos logit contributions
appointed+0.168
chair+0.150
irds+0.147
unct+0.141
bars+0.141
Neg logit contributions
 tuna-0.063
 his-0.057
 fish-0.054
 Sebastian-0.053
 Wally-0.052
Tokenbel
Token ID6667
Feature activation+1.92e+00

Pos logit contributions
appointed+0.209
amas+0.203
umm+0.185
 ceiling+0.185
 doorstep+0.185
Neg logit contributions
 who-0.171
 fish-0.165
 like-0.161
 named-0.161
 his-0.153
Tokenieve
Token ID12311
Feature activation+1.41e-01

Pos logit contributions
appointed+0.088
chair+0.083
irds+0.081
EEP+0.076
unct+0.075
Neg logit contributions
 tuna-0.062
 Wally-0.059
 fisher-0.058
 Sebastian-0.056
 shrimp-0.055
Token he
Token ID339
Feature activation+1.87e+00

Pos logit contributions
appointed+0.007
chair+0.006
irds+0.006
unct+0.006
 halls+0.006
Neg logit contributions
 Wally-0.005
 tuna-0.004
 Sebastian-0.004
 shrimp-0.004
 fisher-0.004
Token had
Token ID550
Feature activation+6.37e-01

Pos logit contributions
appointed+0.098
irds+0.088
amas+0.087
chair+0.086
unct+0.084
Neg logit contributions
 tuna-0.054
 Wally-0.049
 fisher-0.048
 fossil-0.047
 shrimp-0.047
Token super
Token ID2208
Feature activation+2.69e+00

Pos logit contributions
appointed+0.031
chair+0.029
unct+0.027
irds+0.027
amas+0.026
Neg logit contributions
 Wally-0.021
 tuna-0.021
 shrimp-0.020
 Sebastian-0.020
 fisher-0.019
Token the
Token ID262
Feature activation+2.26e+00

Pos logit contributions
appointed+0.056
chair+0.053
unct+0.050
cases+0.049
 halls+0.049
Neg logit contributions
 Wally-0.034
 tuna-0.032
 shrimp-0.030
 Sebastian-0.030
 Finn-0.029
Token little
Token ID1310
Feature activation+2.04e+00

Pos logit contributions
appointed+0.110
irds+0.110
ics+0.109
ring+0.104
bows+0.103
Neg logit contributions
 tuna-0.072
 shrimp-0.060
 Wally-0.058
 named-0.057
 also-0.055
Token girl
Token ID2576
Feature activation+3.75e-01

Pos logit contributions
appointed+0.098
cases+0.094
ics+0.091
ong+0.090
unct+0.089
Neg logit contributions
 tuna-0.071
 shrimp-0.068
 Wally-0.063
 fish-0.061
 fisherman-0.058
Token and
Token ID290
Feature activation+1.44e+00

Pos logit contributions
appointed+0.018
chair+0.017
irds+0.016
unct+0.016
 halls+0.015
Neg logit contributions
 Wally-0.012
 tuna-0.012
 shrimp-0.011
 Sebastian-0.011
 fisher-0.011
Token her
Token ID607
Feature activation+1.97e+00

Pos logit contributions
appointed+0.072
cases+0.067
irds+0.065
amas+0.064
rolling+0.064
Neg logit contributions
 Wally-0.047
 tuna-0.044
 Sebastian-0.042
 shrimp-0.041
 named-0.040
Token arm
Token ID3211
Feature activation+5.01e+00

Pos logit contributions
appointed+0.112
irds+0.100
unct+0.100
iece+0.099
ring+0.098
Neg logit contributions
 tuna-0.052
 Wally-0.045
 named-0.042
 shrimp-0.041
 Otto-0.040
Tokenchair
Token ID16337
Feature activation+2.49e+00

Pos logit contributions
appointed+0.192
unct+0.183
vation+0.169
arse+0.165
irds+0.162
Neg logit contributions
 named-0.188
 and-0.178
 tuna-0.154
 who-0.154
 called-0.153
Token were
Token ID547
Feature activation+7.60e-01

Pos logit contributions
appointed+0.129
amas+0.116
unct+0.115
irds+0.112
chair+0.110
Neg logit contributions
 Wally-0.071
 named-0.071
 tuna-0.067
 Finn-0.063
 shrimp-0.061
Token the
Token ID262
Feature activation+1.37e+00

Pos logit contributions
appointed+0.036
chair+0.035
unct+0.033
irds+0.033
 halls+0.031
Neg logit contributions
 Wally-0.026
 tuna-0.025
 Sebastian-0.024
 shrimp-0.024
 fisher-0.023
Token best
Token ID1266
Feature activation+1.73e+00

Pos logit contributions
appointed+0.060
chair+0.060
irds+0.057
 halls+0.056
cases+0.055
Neg logit contributions
 Wally-0.049
 tuna-0.049
 fisher-0.046
 shrimp-0.045
 Sebastian-0.044
Token of
Token ID286
Feature activation-2.85e-01

Pos logit contributions
appointed+0.102
unct+0.095
iece+0.087
irds+0.087
flash+0.087
Neg logit contributions
 tuna-0.044
 Wally-0.040
 Finn-0.036
 shrimp-0.036
 Otto-0.035
Token when
Token ID618
Feature activation-5.08e-01

Pos logit contributions
 Wally+0.044
 fisher+0.042
 Sebastian+0.042
 tuna+0.042
 shrimp+0.042
Neg logit contributions
chair-0.039
appointed-0.036
clock-0.033
irds-0.033
ring-0.032
Token he
Token ID339
Feature activation-3.15e-01

Pos logit contributions
 Wally+0.015
 tuna+0.014
 Sebastian+0.014
 fisher+0.013
 shrimp+0.013
Neg logit contributions
appointed-0.025
chair-0.024
unct-0.023
irds-0.022
 halls-0.022
Token was
Token ID373
Feature activation+1.09e+00

Pos logit contributions
 Wally+0.012
 tuna+0.012
 Sebastian+0.012
 shrimp+0.011
 fisher+0.011
Neg logit contributions
appointed-0.013
chair-0.013
unct-0.012
irds-0.011
 halls-0.011
Token playing
Token ID2712
Feature activation-1.11e+00

Pos logit contributions
appointed+0.050
chair+0.050
irds+0.047
unct+0.045
amas+0.044
Neg logit contributions
 Wally-0.038
 tuna-0.035
 Sebastian-0.035
 shrimp-0.035
 fisher-0.034
Token make
Token ID787
Feature activation+1.53e+00

Pos logit contributions
 Wally+0.038
 Sebastian+0.037
 shrimp+0.036
 tuna+0.036
 fisher+0.035
Neg logit contributions
chair-0.049
appointed-0.046
 halls-0.043
iece-0.042
amas-0.042
Token-
Token ID12
Feature activation+4.98e+00

Pos logit contributions
appointed+0.080
chair+0.071
unct+0.071
irds+0.069
flash+0.067
Neg logit contributions
 tuna-0.050
 Wally-0.049
 shrimp-0.046
 Sebastian-0.045
 Finn-0.043
Tokenbel
Token ID6667
Feature activation+1.96e+00

Pos logit contributions
appointed+0.199
amas+0.197
 ceiling+0.181
 doorstep+0.179
unct+0.174
Neg logit contributions
 who-0.180
 named-0.172
 shrimp-0.169
 fish-0.161
 like-0.160
Tokenieve
Token ID12311
Feature activation-1.34e-01

Pos logit contributions
appointed+0.089
irds+0.085
chair+0.083
 roof+0.080
EEP+0.079
Neg logit contributions
 tuna-0.061
 Wally-0.060
 fisher-0.058
 Sebastian-0.055
 shrimp-0.054
Token at
Token ID379
Feature activation+4.02e-01

Pos logit contributions
 Wally+0.005
 tuna+0.004
 Sebastian+0.004
 shrimp+0.004
 fisher+0.004
Neg logit contributions
appointed-0.006
chair-0.006
irds-0.006
 halls-0.005
unct-0.005
Token home
Token ID1363
Feature activation+8.78e-01

Pos logit contributions
appointed+0.020
chair+0.019
irds+0.018
unct+0.018
 halls+0.017
Neg logit contributions
 Wally-0.013
 tuna-0.013
 Sebastian-0.012
 shrimp-0.012
 fisher-0.012
Token,
Token ID11
Feature activation-1.22e+00

Pos logit contributions
appointed+0.046
chair+0.043
unct+0.042
irds+0.041
flash+0.040
Neg logit contributions
 Wally-0.025
 tuna-0.024
 shrimp-0.023
 Sebastian-0.023
 fisher-0.022
Token a
Token ID257
Feature activation-1.86e-01

Pos logit contributions
appointed+0.105
chair+0.095
irds+0.093
unct+0.091
amas+0.090
Neg logit contributions
 tuna-0.045
 shrimp-0.045
 Wally-0.043
 Sebastian-0.041
 fish-0.039
Token
Token ID220
Feature activation+4.64e+00

Pos logit contributions
 Wally+0.009
 tuna+0.008
 Sebastian+0.008
 shrimp+0.008
 fisher+0.008
Neg logit contributions
appointed-0.006
chair-0.006
irds-0.005
unct-0.005
 halls-0.005
Token light
Token ID1657
Feature activation+7.06e-01

Pos logit contributions
unct+0.177
appointed+0.174
 cheeks+0.172
erved+0.165
 forehead+0.163
Neg logit contributions
 Wally-0.144
 Sebastian-0.142
 tuna-0.140
 shrimp-0.139
 fisher-0.136
Token
Token ID220
Feature activation+4.83e+00

Pos logit contributions
appointed+0.030
chair+0.027
unct+0.026
irds+0.026
 halls+0.025
Neg logit contributions
 Wally-0.027
 tuna-0.027
 shrimp-0.026
 Sebastian-0.025
 fisher-0.025
Token club
Token ID3430
Feature activation+6.41e-01

Pos logit contributions
unct+0.180
appointed+0.177
 cheeks+0.175
erved+0.170
 halls+0.167
Neg logit contributions
 Wally-0.147
 tuna-0.147
 Sebastian-0.144
inn-0.143
 shrimp-0.141
Token
Token ID220
Feature activation+4.95e+00

Pos logit contributions
appointed+0.031
chair+0.029
unct+0.028
irds+0.028
 halls+0.027
Neg logit contributions
 Wally-0.021
 tuna-0.020
 shrimp-0.019
 Sebastian-0.019
 fisher-0.019
Token spinning
Token ID19493
Feature activation+7.59e-01

Pos logit contributions
unct+0.185
 cheeks+0.180
appointed+0.179
 roof+0.174
 halls+0.174
Neg logit contributions
 Wally-0.153
inn-0.153
 tuna-0.151
 Sebastian-0.147
 Finn-0.144
Token
Token ID220
Feature activation+5.08e+00

Pos logit contributions
appointed+0.036
chair+0.034
unct+0.033
irds+0.032
 halls+0.031
Neg logit contributions
 Wally-0.025
 tuna-0.024
 shrimp-0.023
 Sebastian-0.023
 fisher-0.022
Token in
Token ID287
Feature activation+1.40e-01

Pos logit contributions
unct+0.193
appointed+0.184
 cheeks+0.181
 halls+0.181
 roof+0.180
Neg logit contributions
 Wally-0.159
 tuna-0.157
inn-0.157
 Sebastian-0.149
 Finn-0.149
Token the
Token ID262
Feature activation+1.56e+00

Pos logit contributions
appointed+0.007
chair+0.006
unct+0.006
irds+0.006
 halls+0.006
Neg logit contributions
 Wally-0.005
 tuna-0.005
 Sebastian-0.004
 shrimp-0.004
 fisher-0.004
Token sky
Token ID6766
Feature activation+1.87e+00

Pos logit contributions
appointed+0.051
irds+0.048
unct+0.047
chair+0.045
cases+0.044
Neg logit contributions
 tuna-0.076
 Wally-0.075
 shrimp-0.075
 Sebastian-0.068
 named-0.068
Token way
Token ID835
Feature activation+1.82e+00

Pos logit contributions
appointed+0.008
chair+0.008
unct+0.007
irds+0.007
 halls+0.007
Neg logit contributions
 Wally-0.009
 tuna-0.009
 shrimp-0.009
 Sebastian-0.009
 fisher-0.009
Token.
Token ID13
Feature activation+1.40e+00

Pos logit contributions
appointed+0.108
unct+0.102
chair+0.100
flash+0.097
amas+0.097
Neg logit contributions
 Wally-0.034
 Sebastian-0.031
 tuna-0.030
 named-0.029
 shrimp-0.029
Token This
Token ID770
Feature activation+2.99e-01

Pos logit contributions
unct+0.068
appointed+0.067
irds+0.064
chair+0.062
ring+0.061
Neg logit contributions
 Wally-0.046
 tuna-0.045
 shrimp-0.041
 Finn-0.040
 Sebastian-0.039
Token made
Token ID925
Feature activation+2.72e+00

Pos logit contributions
appointed+0.014
chair+0.013
unct+0.012
irds+0.012
 halls+0.012
Neg logit contributions
 Wally-0.010
 tuna-0.010
 shrimp-0.010
 Sebastian-0.010
 fisher-0.009
Token the
Token ID262
Feature activation+1.98e+00

Pos logit contributions
appointed+0.191
unct+0.182
chair+0.180
irds+0.179
age+0.178
Neg logit contributions
 tuna-0.044
 Wally-0.039
 Finn-0.031
 shrimp-0.028
 Nem-0.023
Token dress
Token ID6576
Feature activation+4.94e+00

Pos logit contributions
appointed+0.092
irds+0.088
cases+0.083
chair+0.083
rolling+0.083
Neg logit contributions
 tuna-0.065
 shrimp-0.065
 named-0.057
 Wally-0.056
 Sebastian-0.054
Token look
Token ID804
Feature activation+1.12e+00

Pos logit contributions
appointed+0.265
unct+0.261
irds+0.249
anna+0.248
EEP+0.244
Neg logit contributions
 like-0.132
 and-0.117
 but-0.094
 just-0.091
 also-0.086
Token different
Token ID1180
Feature activation+7.23e-01

Pos logit contributions
appointed+0.052
chair+0.050
irds+0.048
unct+0.047
oga+0.046
Neg logit contributions
 tuna-0.037
 Wally-0.036
 shrimp-0.035
 Sebastian-0.034
 fisher-0.032
Token,
Token ID11
Feature activation-6.69e-01

Pos logit contributions
appointed+0.041
chair+0.039
unct+0.037
irds+0.037
flash+0.036
Neg logit contributions
 Wally-0.018
 tuna-0.017
 shrimp-0.017
 Sebastian-0.016
 fisher-0.015
Token with
Token ID351
Feature activation+6.48e-01

Pos logit contributions
 Wally+0.020
 fisher+0.019
 Sebastian+0.019
 tuna+0.018
 shrimp+0.018
Neg logit contributions
appointed-0.034
chair-0.033
bows-0.030
 halls-0.030
irds-0.030
Token a
Token ID257
Feature activation+5.14e-01

Pos logit contributions
appointed+0.028
chair+0.026
unct+0.024
irds+0.024
amas+0.023
Neg logit contributions
 Wally-0.025
 tuna-0.025
 shrimp-0.024
 Sebastian-0.024
 fisher-0.023
Token blanket
Token ID18447
Feature activation+7.66e-01

Pos logit contributions
appointed+0.001
chair+0.001
irds+0.001
unct+0.001
 halls+0.001
Neg logit contributions
 Wally-0.001
 tuna-0.001
 Sebastian-0.001
 shrimp-0.001
 fisher-0.001
Token before
Token ID878
Feature activation+8.48e-01

Pos logit contributions
appointed+0.039
chair+0.036
unct+0.035
irds+0.035
 halls+0.034
Neg logit contributions
 Wally-0.024
 tuna-0.022
 shrimp-0.021
 Sebastian-0.021
 fisher-0.020
Token it
Token ID340
Feature activation+2.31e+00

Pos logit contributions
appointed+0.041
chair+0.039
irds+0.037
unct+0.037
flash+0.036
Neg logit contributions
 Wally-0.028
 tuna-0.027
 shrimp-0.026
 Sebastian-0.026
 fisher-0.024
Token was
Token ID373
Feature activation+3.13e-02

Pos logit contributions
appointed+0.114
irds+0.109
unct+0.106
chair+0.103
flash+0.101
Neg logit contributions
 Wally-0.068
 tuna-0.067
 named-0.060
 shrimp-0.059
 Finn-0.058
Token torn
Token ID12445
Feature activation+6.88e-01

Pos logit contributions
appointed+0.001
chair+0.001
unct+0.001
irds+0.001
 halls+0.001
Neg logit contributions
 Wally-0.001
 tuna-0.001
 Sebastian-0.001
 shrimp-0.001
 fisher-0.001
Token
Token ID220
Feature activation+4.93e+00

Pos logit contributions
appointed+0.035
chair+0.033
irds+0.032
unct+0.032
 halls+0.031
Neg logit contributions
 Wally-0.021
 tuna-0.020
 Sebastian-0.019
 shrimp-0.019
 fisher-0.018
Token
Token ID198
Feature activation+9.82e-01

Pos logit contributions
 halls+0.176
unct+0.174
chair+0.171
 floors+0.164
appointed+0.159
Neg logit contributions
inn-0.168
 Wally-0.163
 tuna-0.161
 Sebastian-0.155
 shrimp-0.146
Token
Token ID198
Feature activation+4.03e-01

Pos logit contributions
appointed+0.054
chair+0.051
unct+0.051
irds+0.050
bows+0.050
Neg logit contributions
 Wally-0.025
 tuna-0.023
 Sebastian-0.022
 shrimp-0.021
 Finn-0.020
TokenS
Token ID50
Feature activation+1.61e+00

Pos logit contributions
appointed+0.016
chair+0.015
unct+0.014
irds+0.014
 halls+0.013
Neg logit contributions
 Wally-0.017
 tuna-0.016
 Sebastian-0.016
 shrimp-0.016
 fisher-0.016
Tokenue
Token ID518
Feature activation+1.33e+00

Pos logit contributions
appointed+0.074
chair+0.074
 halls+0.068
 ceiling+0.065
flash+0.065
Neg logit contributions
 Wally-0.055
 shrimp-0.052
 tuna-0.052
 Sebastian-0.051
 fisher-0.048
Token decided
Token ID3066
Feature activation+4.49e-01

Pos logit contributions
appointed+0.063
chair+0.059
irds+0.058
amas+0.056
flash+0.056
Neg logit contributions
 Wally-0.043
 tuna-0.042
 Sebastian-0.039
 shrimp-0.039
 named-0.039
Token from
Token ID422
Feature activation+1.54e+00

Pos logit contributions
 Wally+0.027
 tuna+0.026
 Sebastian+0.025
 fisher+0.025
 shrimp+0.025
Neg logit contributions
appointed-0.031
chair-0.029
unct-0.027
 halls-0.026
clock-0.025
Token home
Token ID1363
Feature activation+1.40e+00

Pos logit contributions
appointed+0.081
irds+0.077
amas+0.075
EEP+0.075
cases+0.074
Neg logit contributions
 Wally-0.046
 tuna-0.044
 shrimp-0.042
 Sebastian-0.039
 named-0.039
Token and
Token ID290
Feature activation+2.20e+00

Pos logit contributions
appointed+0.088
chair+0.084
irds+0.081
unct+0.080
flash+0.080
Neg logit contributions
 Wally-0.023
 tuna-0.021
 shrimp-0.020
 Sebastian-0.020
 named-0.019
Token eventually
Token ID4191
Feature activation+1.82e+00

Pos logit contributions
amas+0.128
irds+0.126
appointed+0.123
oga+0.122
akes+0.122
Neg logit contributions
 named-0.047
 Wally-0.044
 Finn-0.037
 Sebastian-0.037
 Pete-0.035
Token stumbled
Token ID24241
Feature activation+1.22e+00

Pos logit contributions
appointed+0.100
chair+0.097
amas+0.093
clock+0.093
bars+0.092
Neg logit contributions
 Wally-0.048
 Sebastian-0.041
 shrimp-0.041
 named-0.040
 tuna-0.039
Token upon
Token ID2402
Feature activation+4.93e+00

Pos logit contributions
appointed+0.054
chair+0.053
unct+0.049
irds+0.049
clock+0.047
Neg logit contributions
 Wally-0.043
 tuna-0.040
 Sebastian-0.039
 shrimp-0.039
 Finn-0.038
Token a
Token ID257
Feature activation-2.23e-01

Pos logit contributions
irds+0.376
raid+0.363
rolling+0.359
ring+0.357
anna+0.356
Neg logit contributions
 tuna-0.023
 his-0.019
 Wally-0.016
 like-0.013
 shrimp-0.005
Token weird
Token ID7650
Feature activation-7.20e-01

Pos logit contributions
 Wally+0.010
 Sebastian+0.010
 tuna+0.010
 fisher+0.010
 shrimp+0.009
Neg logit contributions
appointed-0.007
chair-0.007
unct-0.006
 halls-0.006
irds-0.006
Token stage
Token ID3800
Feature activation+5.67e-01

Pos logit contributions
 Wally+0.031
 Sebastian+0.029
 tuna+0.028
 fisher+0.028
 Finn+0.028
Neg logit contributions
chair-0.026
appointed-0.026
unct-0.023
 roof-0.023
 ceiling-0.023
Token.
Token ID13
Feature activation+2.89e+00

Pos logit contributions
appointed+0.029
chair+0.027
unct+0.026
irds+0.026
flash+0.025
Neg logit contributions
 Wally-0.017
 tuna-0.017
 shrimp-0.016
 Sebastian-0.016
 fisher-0.015
Token On
Token ID1550
Feature activation+1.32e+00

Pos logit contributions
unct+0.146
iece+0.139
irds+0.137
arse+0.137
vation+0.134
Neg logit contributions
 Wally-0.092
 tuna-0.085
 Otto-0.078
 shrimp-0.077
 named-0.076
Token song
Token ID3496
Feature activation+2.38e+00

Pos logit contributions
appointed+0.141
unct+0.132
bars+0.127
cases+0.126
rolling+0.125
Neg logit contributions
 fish-0.115
 shrimp-0.111
 tuna-0.111
 and-0.110
 named-0.104
Token and
Token ID290
Feature activation+1.33e+00

Pos logit contributions
appointed+0.149
unct+0.141
 halls+0.130
amas+0.129
rolling+0.129
Neg logit contributions
 Wally-0.047
 tuna-0.045
 shrimp-0.039
 Finn-0.035
 named-0.035
Token make
Token ID787
Feature activation+3.93e+00

Pos logit contributions
appointed+0.056
chair+0.052
amas+0.051
unct+0.049
cases+0.048
Neg logit contributions
 Wally-0.054
 tuna-0.051
 shrimp-0.050
 Sebastian-0.048
 fisher-0.047
Token the
Token ID262
Feature activation+2.71e+00

Pos logit contributions
appointed+0.247
bars+0.226
chair+0.223
cases+0.217
ooter+0.217
Neg logit contributions
 tuna-0.095
 fish-0.089
 shrimp-0.086
 Wally-0.086
 them-0.084
Token toy
Token ID13373
Feature activation+3.64e+00

Pos logit contributions
cases+0.133
appointed+0.128
bars+0.126
chair+0.119
limits+0.114
Neg logit contributions
 fish-0.096
 tuna-0.095
 shrimp-0.094
 Wally-0.084
 fisherman-0.083
Token chain
Token ID6333
Feature activation+4.93e+00

Pos logit contributions
appointed+0.187
unct+0.154
bars+0.151
amas+0.150
cases+0.144
Neg logit contributions
 fish-0.125
 tuna-0.111
 named-0.110
 like-0.107
 and-0.106
Token dance
Token ID9280
Feature activation+2.06e+00

Pos logit contributions
appointed+0.261
unct+0.251
amas+0.235
olk+0.234
chair+0.231
Neg logit contributions
 like-0.141
 and-0.125
 fish-0.110
 his-0.108
 tuna-0.105
Token in
Token ID287
Feature activation-1.98e-01

Pos logit contributions
appointed+0.124
unct+0.116
chair+0.109
bars+0.107
irds+0.106
Neg logit contributions
 tuna-0.046
 Wally-0.046
 shrimp-0.042
 like-0.037
 Finn-0.037
Token the
Token ID262
Feature activation+2.13e+00

Pos logit contributions
 Wally+0.007
 tuna+0.007
 Sebastian+0.007
 shrimp+0.007
 fisher+0.007
Neg logit contributions
appointed-0.009
chair-0.008
irds-0.008
unct-0.008
 halls-0.008
Token air
Token ID1633
Feature activation+9.03e-01

Pos logit contributions
cases+0.090
appointed+0.086
irds+0.085
owed+0.080
bars+0.079
Neg logit contributions
 tuna-0.091
 shrimp-0.089
 fish-0.082
 Wally-0.081
 named-0.078
Token.
Token ID13
Feature activation+1.49e+00

Pos logit contributions
appointed+0.047
chair+0.045
unct+0.043
irds+0.042
 halls+0.041
Neg logit contributions
 Wally-0.026
 tuna-0.025
 shrimp-0.024
 Sebastian-0.024
 fisher-0.023
Token louder
Token ID27089
Feature activation+1.05e+00

Pos logit contributions
appointed+0.023
chair+0.022
irds+0.020
unct+0.020
amas+0.019
Neg logit contributions
 Wally-0.022
 tuna-0.022
 Sebastian-0.021
 shrimp-0.021
 fisher-0.020
Token.
Token ID13
Feature activation+8.26e-01

Pos logit contributions
appointed+0.059
chair+0.056
unct+0.053
irds+0.053
amas+0.052
Neg logit contributions
 Wally-0.025
 tuna-0.025
 Sebastian-0.024
 shrimp-0.024
 fisher-0.023
Token Suddenly
Token ID24975
Feature activation+6.11e-01

Pos logit contributions
appointed+0.037
chair+0.035
unct+0.035
irds+0.034
ring+0.034
Neg logit contributions
 Wally-0.031
 tuna-0.029
 Sebastian-0.028
 Finn-0.027
 Fin-0.027
Token,
Token ID11
Feature activation+7.17e-01

Pos logit contributions
appointed+0.034
chair+0.032
unct+0.030
irds+0.030
amas+0.030
Neg logit contributions
 Wally-0.016
 tuna-0.016
 shrimp-0.015
 Sebastian-0.015
 fisher-0.014
Token the
Token ID262
Feature activation+3.56e+00

Pos logit contributions
appointed+0.039
chair+0.037
irds+0.035
unct+0.035
ring+0.035
Neg logit contributions
 Wally-0.021
 tuna-0.020
 shrimp-0.019
 Sebastian-0.019
 fisher-0.018
Token ringing
Token ID32333
Feature activation+4.85e+00

Pos logit contributions
ics+0.196
irds+0.193
ance+0.190
owed+0.183
bows+0.183
Neg logit contributions
 and-0.075
 also-0.074
 tuna-0.073
 who-0.073
 named-0.069
Token stopped
Token ID5025
Feature activation+4.38e-01

Pos logit contributions
unct+0.235
appointed+0.228
rolling+0.215
irds+0.213
ics+0.211
Neg logit contributions
 named-0.126
 and-0.122
 his-0.112
 made-0.112
 who-0.108
Token and
Token ID290
Feature activation+7.60e-01

Pos logit contributions
appointed+0.024
chair+0.023
irds+0.022
unct+0.021
 halls+0.021
Neg logit contributions
 Wally-0.012
 tuna-0.011
 shrimp-0.011
 Sebastian-0.011
 fisher-0.010
Token a
Token ID257
Feature activation+7.21e-01

Pos logit contributions
appointed+0.041
chair+0.039
irds+0.037
amas+0.036
 halls+0.036
Neg logit contributions
 Wally-0.022
 tuna-0.021
 shrimp-0.020
 Sebastian-0.020
 Finn-0.019
Token funny
Token ID8258
Feature activation+2.19e+00

Pos logit contributions
appointed+0.027
chair+0.025
unct+0.024
irds+0.023
ring+0.022
Neg logit contributions
 Wally-0.032
 tuna-0.031
 shrimp-0.030
 Sebastian-0.030
 fisher-0.029
Token voice
Token ID3809
Feature activation+2.76e-01

Pos logit contributions
appointed+0.097
cases+0.085
irds+0.083
bars+0.082
unct+0.082
Neg logit contributions
 tuna-0.088
 shrimp-0.084
 Wally-0.082
 shark-0.081
 fisher-0.078
Token I
Token ID314
Feature activation-1.80e-03

Pos logit contributions
appointed+0.095
unct+0.092
chair+0.084
 halls+0.083
iece+0.083
Neg logit contributions
 Wally-0.069
 Sebastian-0.066
 tuna-0.065
 shrimp-0.064
 Fin-0.061
Token see
Token ID766
Feature activation+1.79e+00

Pos logit contributions
 Wally+0.000
 tuna+0.000
 Sebastian+0.000
 shrimp+0.000
 fisher+0.000
Neg logit contributions
appointed-0.000
chair-0.000
unct-0.000
irds-0.000
 halls-0.000
Token a
Token ID257
Feature activation-1.86e-01

Pos logit contributions
appointed+0.105
chair+0.095
irds+0.093
unct+0.091
amas+0.090
Neg logit contributions
 tuna-0.045
 shrimp-0.045
 Wally-0.043
 Sebastian-0.041
 fish-0.039
Token
Token ID220
Feature activation+4.64e+00

Pos logit contributions
 Wally+0.009
 tuna+0.008
 Sebastian+0.008
 shrimp+0.008
 fisher+0.008
Neg logit contributions
appointed-0.006
chair-0.006
irds-0.005
unct-0.005
 halls-0.005
Token light
Token ID1657
Feature activation+7.06e-01

Pos logit contributions
unct+0.177
appointed+0.174
 cheeks+0.172
erved+0.165
 forehead+0.163
Neg logit contributions
 Wally-0.144
 Sebastian-0.142
 tuna-0.140
 shrimp-0.139
 fisher-0.136
Token
Token ID220
Feature activation+4.83e+00

Pos logit contributions
appointed+0.030
chair+0.027
unct+0.026
irds+0.026
 halls+0.025
Neg logit contributions
 Wally-0.027
 tuna-0.027
 shrimp-0.026
 Sebastian-0.025
 fisher-0.025
Token club
Token ID3430
Feature activation+6.41e-01

Pos logit contributions
unct+0.180
appointed+0.177
 cheeks+0.175
erved+0.170
 halls+0.167
Neg logit contributions
 Wally-0.147
 tuna-0.147
 Sebastian-0.144
inn-0.143
 shrimp-0.141
Token
Token ID220
Feature activation+4.95e+00

Pos logit contributions
appointed+0.031
chair+0.029
unct+0.028
irds+0.028
 halls+0.027
Neg logit contributions
 Wally-0.021
 tuna-0.020
 shrimp-0.019
 Sebastian-0.019
 fisher-0.019
Token spinning
Token ID19493
Feature activation+7.59e-01

Pos logit contributions
unct+0.185
 cheeks+0.180
appointed+0.179
 roof+0.174
 halls+0.174
Neg logit contributions
 Wally-0.153
inn-0.153
 tuna-0.151
 Sebastian-0.147
 Finn-0.144
Token
Token ID220
Feature activation+5.08e+00

Pos logit contributions
appointed+0.036
chair+0.034
unct+0.033
irds+0.032
 halls+0.031
Neg logit contributions
 Wally-0.025
 tuna-0.024
 shrimp-0.023
 Sebastian-0.023
 fisher-0.022
Token in
Token ID287
Feature activation+1.40e-01

Pos logit contributions
unct+0.193
appointed+0.184
 cheeks+0.181
 halls+0.181
 roof+0.180
Neg logit contributions
 Wally-0.159
 tuna-0.157
inn-0.157
 Sebastian-0.149
 Finn-0.149
Token is
Token ID318
Feature activation-3.19e+00

Pos logit contributions
 Wally+0.092
 Sebastian+0.091
 fisher+0.090
 named+0.088
 Finn+0.087
Neg logit contributions
est-0.069
ness-0.067
bows-0.066
 sounds-0.062
 same-0.061
Token shrimp
Token ID30060
Feature activation-1.96e+00

Pos logit contributions
 fisher+0.126
 fisherman+0.120
Dave+0.119
 Sebastian+0.117
olph+0.113
Neg logit contributions
clock-0.087
 ceiling-0.087
appointed-0.084
 Lightning-0.083
 walls-0.083
Token.
Token ID13
Feature activation-2.31e+00

Pos logit contributions
 Sebastian+0.075
 Wally+0.072
 fisher+0.071
 Finn+0.065
 Otto+0.065
Neg logit contributions
chair-0.076
appointed-0.066
bows-0.065
oga-0.063
cases-0.062
Token It
Token ID632
Feature activation-2.10e+00

Pos logit contributions
 Wally+0.096
 fisher+0.093
 shrimp+0.092
 tuna+0.092
 Sebastian+0.091
Neg logit contributions
appointed-0.069
chair-0.068
irds-0.065
ring-0.065
clock-0.063
Token is
Token ID318
Feature activation-3.92e+00

Pos logit contributions
 Sebastian+0.106
 fisher+0.103
 Wally+0.102
 fisherman+0.098
Dave+0.096
Neg logit contributions
bows-0.043
appointed-0.043
chair-0.043
als-0.043
 sounds-0.041
Token a
Token ID257
Feature activation-3.96e+00

Pos logit contributions
 fisher+0.160
Dave+0.154
 Sebastian+0.154
 fisherman+0.152
 Wally+0.148
Neg logit contributions
clock-0.097
 sounds-0.097
 walls-0.096
 ceiling-0.095
 biggest-0.094
Token small
Token ID1402
Feature activation-3.65e+00

Pos logit contributions
 Sebastian+0.163
 Wally+0.158
 Fin+0.150
 Finn+0.147
 Otto+0.146
Neg logit contributions
 roof-0.115
 ceiling-0.112
 wall-0.108
 whole-0.107
 job-0.107
Token fish
Token ID5916
Feature activation-2.60e+00

Pos logit contributions
 Wally+0.105
Stan+0.100
 Sebastian+0.099
Dave+0.093
 Fin+0.093
Neg logit contributions
ness-0.164
est-0.157
 roof-0.147
 ceiling-0.147
 wall-0.140
Token that
Token ID326
Feature activation-2.07e+00

Pos logit contributions
 Sebastian+0.106
 Wally+0.104
 Otto+0.096
 Finn+0.093
 fisher+0.092
Neg logit contributions
chair-0.094
ring-0.083
age-0.081
bows-0.078
ong-0.075
Token lives
Token ID3160
Feature activation-1.22e+00

Pos logit contributions
 Sebastian+0.084
 Wally+0.080
 fisher+0.077
 Otto+0.073
 tuna+0.073
Neg logit contributions
chair-0.066
appointed-0.064
 halls-0.061
 ceiling-0.060
als-0.060
Token in
Token ID287
Feature activation-1.78e+00

Pos logit contributions
 Sebastian+0.045
 Wally+0.045
 fisher+0.041
 tuna+0.041
 Otto+0.040
Neg logit contributions
chair-0.049
appointed-0.049
 halls-0.046
irds-0.045
clock-0.045
TokenThis
Token ID1212
Feature activation-2.31e+00

Pos logit contributions
 Wally+0.049
 tuna+0.045
 Sebastian+0.045
 Finn+0.043
 fisher+0.043
Neg logit contributions
appointed-0.045
chair-0.043
irds-0.040
ring-0.040
clock-0.039
Token is
Token ID318
Feature activation-3.19e+00

Pos logit contributions
 Wally+0.092
 Sebastian+0.091
 fisher+0.090
 named+0.088
 Finn+0.087
Neg logit contributions
est-0.069
ness-0.067
bows-0.066
 sounds-0.062
 same-0.061
Token shrimp
Token ID30060
Feature activation-1.96e+00

Pos logit contributions
 fisher+0.126
 fisherman+0.120
Dave+0.119
 Sebastian+0.117
olph+0.113
Neg logit contributions
clock-0.087
 ceiling-0.087
appointed-0.084
 Lightning-0.083
 walls-0.083
Token.
Token ID13
Feature activation-2.31e+00

Pos logit contributions
 Sebastian+0.075
 Wally+0.072
 fisher+0.071
 Finn+0.065
 Otto+0.065
Neg logit contributions
chair-0.076
appointed-0.066
bows-0.065
oga-0.063
cases-0.062
Token It
Token ID632
Feature activation-2.10e+00

Pos logit contributions
 Wally+0.096
 fisher+0.093
 shrimp+0.092
 tuna+0.092
 Sebastian+0.091
Neg logit contributions
appointed-0.069
chair-0.068
irds-0.065
ring-0.065
clock-0.063
Token is
Token ID318
Feature activation-3.92e+00

Pos logit contributions
 Sebastian+0.106
 fisher+0.103
 Wally+0.102
 fisherman+0.098
Dave+0.096
Neg logit contributions
bows-0.043
appointed-0.043
chair-0.043
als-0.043
 sounds-0.041
Token a
Token ID257
Feature activation-3.96e+00

Pos logit contributions
 fisher+0.160
Dave+0.154
 Sebastian+0.154
 fisherman+0.152
 Wally+0.148
Neg logit contributions
clock-0.097
 sounds-0.097
 walls-0.096
 ceiling-0.095
 biggest-0.094
Token small
Token ID1402
Feature activation-3.65e+00

Pos logit contributions
 Sebastian+0.163
 Wally+0.158
 Fin+0.150
 Finn+0.147
 Otto+0.146
Neg logit contributions
 roof-0.115
 ceiling-0.112
 wall-0.108
 whole-0.107
 job-0.107
Token fish
Token ID5916
Feature activation-2.60e+00

Pos logit contributions
 Wally+0.105
Stan+0.100
 Sebastian+0.099
Dave+0.093
 Fin+0.093
Neg logit contributions
ness-0.164
est-0.157
 roof-0.147
 ceiling-0.147
 wall-0.140
Token that
Token ID326
Feature activation-2.07e+00

Pos logit contributions
 Sebastian+0.106
 Wally+0.104
 Otto+0.096
 Finn+0.093
 fisher+0.092
Neg logit contributions
chair-0.094
ring-0.083
age-0.081
bows-0.078
ong-0.075
Token lives
Token ID3160
Feature activation-1.22e+00

Pos logit contributions
 Sebastian+0.084
 Wally+0.080
 fisher+0.077
 Otto+0.073
 tuna+0.073
Neg logit contributions
chair-0.066
appointed-0.064
 halls-0.061
 ceiling-0.060
als-0.060
Token juice
Token ID13135
Feature activation-3.24e+00

Pos logit contributions
 Sebastian+0.094
 Wally+0.093
 Fin+0.086
 Finn+0.086
 Otto+0.084
Neg logit contributions
 roof-0.071
chair-0.069
 ceiling-0.066
appointed-0.062
bows-0.061
Token.
Token ID13
Feature activation+9.74e-01

Pos logit contributions
 Sebastian+0.144
 fossil+0.139
 fisher+0.134
 Wally+0.130
 Fin+0.129
Neg logit contributions
chair-0.100
 ceiling-0.091
 roof-0.089
als-0.085
bows-0.085
Token Sam
Token ID3409
Feature activation-1.94e+00

Pos logit contributions
appointed+0.039
chair+0.037
unct+0.037
ring+0.035
filled+0.034
Neg logit contributions
 Wally-0.041
 tuna-0.038
 shrimp-0.038
 Sebastian-0.037
 Finn-0.037
Token wanted
Token ID2227
Feature activation-1.50e+00

Pos logit contributions
 Sebastian+0.081
 Wally+0.079
 fisher+0.078
 fossil+0.076
 tuna+0.075
Neg logit contributions
chair-0.068
appointed-0.064
cases-0.060
 halls-0.058
up-0.055
Token a
Token ID257
Feature activation-1.85e+00

Pos logit contributions
 fisher+0.072
 Sebastian+0.072
 fossil+0.071
 Wally+0.069
 fisherman+0.068
Neg logit contributions
chair-0.034
clock-0.033
up-0.032
 halls-0.032
ups-0.031
Token pizza
Token ID14256
Feature activation-3.90e+00

Pos logit contributions
 Sebastian+0.081
 Wally+0.078
 Fin+0.075
 Finn+0.073
 Otto+0.070
Neg logit contributions
chair-0.053
 roof-0.051
appointed-0.049
irds-0.048
 ABC-0.047
Token and
Token ID290
Feature activation-8.52e-01

Pos logit contributions
 Sebastian+0.179
 aimed+0.164
 fossil+0.164
 Wally+0.162
Jon+0.162
Neg logit contributions
chair-0.099
bye-0.090
clock-0.085
cakes-0.085
 job-0.085
Token a
Token ID257
Feature activation-2.30e+00

Pos logit contributions
 Wally+0.032
 Sebastian+0.032
 fisher+0.032
 fossil+0.031
 tuna+0.030
Neg logit contributions
chair-0.032
appointed-0.032
 halls-0.028
irds-0.028
clock-0.027
Token milk
Token ID7545
Feature activation-3.00e+00

Pos logit contributions
 Sebastian+0.102
 Wally+0.098
 Fin+0.093
 Finn+0.092
 Otto+0.090
Neg logit contributions
 roof-0.069
chair-0.065
 ceiling-0.062
appointed-0.060
clock-0.058
Token.
Token ID13
Feature activation+2.07e-01

Pos logit contributions
 fossil+0.144
 fisher+0.136
 Sebastian+0.136
 Wally+0.133
Bruce+0.131
Neg logit contributions
chair-0.075
ness-0.069
bows-0.067
est-0.067
 roof-0.067
Token Mom
Token ID11254
Feature activation-2.03e+00

Pos logit contributions
appointed+0.010
chair+0.010
unct+0.009
irds+0.009
 halls+0.009
Neg logit contributions
 Wally-0.007
 tuna-0.006
 shrimp-0.006
 Sebastian-0.006
 fisher-0.006
Token They
Token ID1119
Feature activation-1.19e+00

Pos logit contributions
 fisher+0.065
 fossil+0.065
 shrimp+0.065
 shark+0.063
 pearl+0.063
Neg logit contributions
appointed-0.044
chair-0.041
clock-0.040
 Everyone-0.038
irds-0.038
Token had
Token ID550
Feature activation-2.99e-01

Pos logit contributions
 Wally+0.051
 tuna+0.049
 Sebastian+0.048
 shrimp+0.047
 fisher+0.046
Neg logit contributions
chair-0.043
appointed-0.043
 halls-0.041
 roof-0.039
cases-0.039
Token a
Token ID257
Feature activation-1.30e+00

Pos logit contributions
 Wally+0.012
 Sebastian+0.011
 tuna+0.011
 fisher+0.011
 shrimp+0.011
Neg logit contributions
appointed-0.012
chair-0.011
 halls-0.011
clock-0.010
irds-0.010
Token lot
Token ID1256
Feature activation-5.95e-01

Pos logit contributions
 Sebastian+0.067
 Wally+0.066
 Finn+0.063
 Crab+0.062
 Otto+0.062
Neg logit contributions
 roof-0.033
appointed-0.031
chair-0.031
 ceiling-0.030
 ABC-0.028
Token of
Token ID286
Feature activation-6.10e-01

Pos logit contributions
 Wally+0.030
 Sebastian+0.030
 fisher+0.028
 tuna+0.028
 Otto+0.028
Neg logit contributions
chair-0.019
appointed-0.017
 halls-0.016
up-0.016
bows-0.016
Token choices
Token ID7747
Feature activation-3.89e+00

Pos logit contributions
 Wally+0.028
 Sebastian+0.027
 Crab+0.025
 fisher+0.025
 Otto+0.025
Neg logit contributions
appointed-0.022
chair-0.021
 halls-0.020
bows-0.019
clock-0.019
Token.
Token ID13
Feature activation-1.82e+00

Pos logit contributions
 Sebastian+0.216
Fin+0.208
 fossil+0.207
 Otto+0.206
 Wally+0.205
Neg logit contributions
bows-0.073
chair-0.071
 ceiling-0.062
ous-0.058
ring-0.057
Token They
Token ID1119
Feature activation-1.53e+00

Pos logit contributions
 fisher+0.081
 pearl+0.080
 fossil+0.080
 Sebastian+0.080
 shark+0.080
Neg logit contributions
chair-0.048
appointed-0.048
clock-0.048
 Everyone-0.047
 Lightning-0.045
Token were
Token ID547
Feature activation-1.08e+00

Pos logit contributions
 Wally+0.067
 Sebastian+0.064
 tuna+0.063
 fisher+0.060
 Otto+0.060
Neg logit contributions
chair-0.052
appointed-0.052
 halls-0.050
 roof-0.047
cases-0.047
Token not
Token ID407
Feature activation-2.89e-01

Pos logit contributions
 Wally+0.049
 Sebastian+0.048
 tuna+0.046
 fossil+0.046
 Crab+0.046
Neg logit contributions
appointed-0.036
 halls-0.030
clock-0.029
chair-0.028
 floors-0.028
Token bored
Token ID17533
Feature activation-3.70e-01

Pos logit contributions
 Wally+0.012
 tuna+0.011
 Sebastian+0.011
 shrimp+0.011
 fisher+0.011
Neg logit contributions
appointed-0.011
chair-0.011
unct-0.010
 halls-0.010
irds-0.010
Token.
Token ID13
Feature activation-1.25e+00

Pos logit contributions
 Wally+0.029
 Sebastian+0.028
 fisher+0.027
 tuna+0.027
 shrimp+0.026
Neg logit contributions
chair-0.033
appointed-0.032
bows-0.029
ring-0.029
bars-0.028
Token He
Token ID679
Feature activation-2.73e+00

Pos logit contributions
 shrimp+0.052
 Wally+0.050
 Sebastian+0.049
 tuna+0.049
 fisher+0.047
Neg logit contributions
chair-0.044
appointed-0.042
clock-0.039
 Lightning-0.039
irds-0.039
Token thinks
Token ID6834
Feature activation-1.34e+00

Pos logit contributions
 Sebastian+0.140
 Wally+0.137
 Otto+0.128
Dave+0.124
 Crab+0.124
Neg logit contributions
 sounds-0.065
 starts-0.060
 feels-0.057
bows-0.056
chair-0.055
Token it
Token ID340
Feature activation-3.82e-01

Pos logit contributions
 Sebastian+0.052
 Wally+0.051
 fisher+0.049
 fossil+0.048
 shrimp+0.048
Neg logit contributions
chair-0.049
irds-0.044
ong-0.043
 halls-0.042
 ceiling-0.042
Token is
Token ID318
Feature activation-3.06e+00

Pos logit contributions
 Wally+0.017
 Sebastian+0.016
 shrimp+0.016
 tuna+0.016
 fisher+0.016
Neg logit contributions
appointed-0.013
chair-0.013
bows-0.012
 halls-0.012
unct-0.011
Token a
Token ID257
Feature activation-3.89e+00

Pos logit contributions
Dave+0.154
olph+0.153
 fisherman+0.153
 fisher+0.152
inn+0.149
Neg logit contributions
 sounds-0.059
 biggest-0.059
appointed-0.053
 same-0.047
clock-0.044
Token cool
Token ID3608
Feature activation-2.10e+00

Pos logit contributions
 Wally+0.197
 Otto+0.195
 Sebastian+0.194
inn+0.192
 Finn+0.188
Neg logit contributions
 job-0.085
 song-0.080
 roof-0.076
 whole-0.076
 biggest-0.071
Token gun
Token ID2485
Feature activation-1.07e+00

Pos logit contributions
 Wally+0.085
 Otto+0.080
Dave+0.078
 Finn+0.077
Stan+0.076
Neg logit contributions
ness-0.077
est-0.072
 roof-0.070
bye-0.069
bows-0.067
Token.
Token ID13
Feature activation-1.47e+00

Pos logit contributions
 Wally+0.043
 Sebastian+0.042
 shrimp+0.040
 fisher+0.040
 tuna+0.040
Neg logit contributions
chair-0.042
appointed-0.039
bows-0.037
ring-0.036
clock-0.036
Token
Token ID198
Feature activation-3.93e-01

Pos logit contributions
 shrimp+0.067
 Wally+0.062
 tuna+0.062
 Sebastian+0.062
 pearl+0.061
Neg logit contributions
chair-0.046
appointed-0.045
 Lightning-0.042
clock-0.041
EEP-0.041
Token
Token ID198
Feature activation-6.65e-01

Pos logit contributions
 Wally+0.014
 shrimp+0.013
 tuna+0.013
 Sebastian+0.013
 fisher+0.013
Neg logit contributions
appointed-0.018
chair-0.017
irds-0.016
 halls-0.015
unct-0.015
Token lunch
Token ID9965
Feature activation-6.43e-01

Pos logit contributions
appointed+0.058
chair+0.054
irds+0.053
cases+0.051
ring+0.050
Neg logit contributions
 Wally-0.032
 tuna-0.032
 shrimp-0.030
 Sebastian-0.028
 Finn-0.026
Token.
Token ID13
Feature activation-4.55e-01

Pos logit contributions
 Wally+0.022
 Sebastian+0.020
 tuna+0.020
 fisher+0.020
 shrimp+0.020
Neg logit contributions
appointed-0.029
chair-0.029
irds-0.026
amas-0.026
 halls-0.026
Token One
Token ID1881
Feature activation+7.29e-01

Pos logit contributions
 Wally+0.018
 tuna+0.017
 shrimp+0.017
 fisher+0.017
 Sebastian+0.017
Neg logit contributions
appointed-0.018
chair-0.018
irds-0.016
 halls-0.015
unct-0.015
Token had
Token ID550
Feature activation+8.04e-01

Pos logit contributions
appointed+0.024
chair+0.021
unct+0.021
irds+0.020
flash+0.020
Neg logit contributions
 Wally-0.036
 tuna-0.036
 shrimp-0.035
 Sebastian-0.033
 fisher-0.033
Token a
Token ID257
Feature activation-1.37e+00

Pos logit contributions
appointed+0.039
chair+0.036
unct+0.035
irds+0.035
ring+0.033
Neg logit contributions
 Wally-0.027
 tuna-0.027
 shrimp-0.026
 Sebastian-0.025
 fisher-0.024
Token tuna
Token ID38883
Feature activation-3.87e+00

Pos logit contributions
 Wally+0.063
 Sebastian+0.062
 fisher+0.059
 Fin+0.058
 Finn+0.057
Neg logit contributions
chair-0.043
appointed-0.040
 roof-0.040
 ceiling-0.038
 halls-0.036
Token sandwich
Token ID20433
Feature activation-2.85e+00

Pos logit contributions
 Sebastian+0.153
 Wally+0.141
Stan+0.138
Dave+0.138
 fisher+0.137
Neg logit contributions
chair-0.119
als-0.109
 ceiling-0.109
 roof-0.104
up-0.102
Token and
Token ID290
Feature activation-1.46e+00

Pos logit contributions
 fisher+0.112
 Sebastian+0.110
 Wally+0.108
 Finn+0.100
 fossil+0.100
Neg logit contributions
chair-0.115
bows-0.099
case-0.097
ring-0.096
cases-0.095
Token the
Token ID262
Feature activation-1.38e+00

Pos logit contributions
 Wally+0.049
 fisher+0.048
 Sebastian+0.048
 fossil+0.044
 tuna+0.043
Neg logit contributions
chair-0.060
appointed-0.059
 halls-0.054
irds-0.053
 roof-0.053
Token other
Token ID584
Feature activation-9.05e-01

Pos logit contributions
 Wally+0.041
 Sebastian+0.040
 fisher+0.037
 Otto+0.034
 Fin+0.034
Neg logit contributions
appointed-0.060
chair-0.060
 ceiling-0.056
unct-0.056
 halls-0.055
Token an
Token ID281
Feature activation-1.51e+00

Pos logit contributions
 Wally+0.035
 Sebastian+0.034
 fisher+0.032
 fossil+0.032
 shrimp+0.031
Neg logit contributions
appointed-0.033
chair-0.033
ness-0.030
 halls-0.030
clock-0.030
Token.
Token ID13
Feature activation+1.25e+00

Pos logit contributions
 Wally+0.014
 tuna+0.013
 shrimp+0.013
 Sebastian+0.013
 fisher+0.012
Neg logit contributions
appointed-0.015
chair-0.014
unct-0.013
irds-0.013
ring-0.013
Token This
Token ID770
Feature activation-1.08e+00

Pos logit contributions
appointed+0.058
unct+0.052
chair+0.051
 halls+0.050
cases+0.049
Neg logit contributions
 Wally-0.047
 tuna-0.044
 shrimp-0.043
 Sebastian-0.042
 Finn-0.041
Token shrimp
Token ID30060
Feature activation-3.97e-01

Pos logit contributions
 Sebastian+0.050
 Wally+0.050
 fisher+0.050
 Finn+0.047
 shrimp+0.046
Neg logit contributions
chair-0.032
appointed-0.029
est-0.027
oga-0.027
 roof-0.027
Token is
Token ID318
Feature activation-2.62e+00

Pos logit contributions
 Wally+0.016
 Sebastian+0.016
 tuna+0.016
 fisher+0.015
 shrimp+0.015
Neg logit contributions
appointed-0.015
chair-0.015
unct-0.013
irds-0.013
 halls-0.013
Token not
Token ID407
Feature activation-1.29e+00

Pos logit contributions
 Sebastian+0.121
 fisher+0.120
 fisherman+0.119
Dave+0.118
 Otto+0.111
Neg logit contributions
appointed-0.061
 same-0.055
clock-0.052
 walls-0.050
ong-0.049
Token a
Token ID257
Feature activation-3.87e+00

Pos logit contributions
 Wally+0.054
 Sebastian+0.053
 fisher+0.051
 Finn+0.050
 Otto+0.049
Neg logit contributions
appointed-0.043
ong-0.038
chair-0.037
ring-0.037
bows-0.036
Token food
Token ID2057
Feature activation-2.74e+00

Pos logit contributions
 Sebastian+0.167
 Wally+0.156
 Finn+0.151
 Otto+0.150
inn+0.148
Neg logit contributions
 roof-0.104
 same-0.097
 job-0.093
 whole-0.093
 ceiling-0.091
Token.
Token ID13
Feature activation-1.22e+00

Pos logit contributions
 Sebastian+0.125
 fisher+0.118
 Wally+0.117
 fossil+0.116
 Finn+0.113
Neg logit contributions
chair-0.091
 ceiling-0.073
 roof-0.072
case-0.072
clock-0.068
Token It
Token ID632
Feature activation-2.96e-01

Pos logit contributions
 Wally+0.056
 Sebastian+0.054
 tuna+0.054
 fisher+0.054
 shrimp+0.054
Neg logit contributions
chair-0.036
appointed-0.035
ring-0.033
irds-0.033
clock-0.031
Token is
Token ID318
Feature activation-3.14e+00

Pos logit contributions
 Wally+0.013
 Sebastian+0.013
 tuna+0.012
 shrimp+0.012
 fisher+0.012
Neg logit contributions
appointed-0.011
chair-0.010
unct-0.009
irds-0.009
 halls-0.009
Token a
Token ID257
Feature activation-3.71e+00

Pos logit contributions
 fisher+0.144
 Sebastian+0.142
Dave+0.140
 fisherman+0.134
 Wally+0.129
Neg logit contributions
appointed-0.075
 same-0.071
 ceiling-0.070
clock-0.069
 roof-0.067
Token wanted
Token ID2227
Feature activation-8.20e-01

Pos logit contributions
 Wally+0.055
 tuna+0.054
 Sebastian+0.052
 fossil+0.052
 fisher+0.051
Neg logit contributions
appointed-0.050
chair-0.049
cases-0.043
 halls-0.043
irds-0.043
Token to
Token ID284
Feature activation-6.69e-01

Pos logit contributions
 Wally+0.038
 Sebastian+0.037
 fisher+0.037
 fossil+0.036
 tuna+0.036
Neg logit contributions
chair-0.024
appointed-0.023
 halls-0.020
clock-0.020
filled-0.020
Token look
Token ID804
Feature activation-1.58e+00

Pos logit contributions
 tuna+0.029
 Wally+0.028
 shrimp+0.028
 Sebastian+0.028
 fossil+0.027
Neg logit contributions
appointed-0.024
chair-0.022
irds-0.020
unct-0.020
 halls-0.019
Token for
Token ID329
Feature activation-1.52e+00

Pos logit contributions
 Wally+0.072
 Sebastian+0.071
 fisher+0.070
 tuna+0.069
 Crab+0.069
Neg logit contributions
appointed-0.052
chair-0.051
cases-0.044
unct-0.042
ong-0.042
Token a
Token ID257
Feature activation-1.11e+00

Pos logit contributions
 fisher+0.058
 fossil+0.058
 fisherman+0.056
 Wally+0.055
 Sebastian+0.055
Neg logit contributions
chair-0.053
appointed-0.051
 walls-0.046
 halls-0.045
filled-0.045
Token sport
Token ID6332
Feature activation-3.86e+00

Pos logit contributions
 Wally+0.047
 Sebastian+0.047
 tuna+0.044
 fisher+0.043
 shrimp+0.043
Neg logit contributions
appointed-0.037
chair-0.036
irds-0.031
 ceiling-0.031
 roof-0.031
Token he
Token ID339
Feature activation-6.10e-01

Pos logit contributions
 Sebastian+0.173
 fossil+0.160
Stan+0.158
 Otto+0.156
 Wally+0.155
Neg logit contributions
chair-0.129
case-0.097
seek-0.093
 exam-0.093
 stair-0.092
Token could
Token ID714
Feature activation-8.91e-01

Pos logit contributions
 Wally+0.024
 tuna+0.024
 Sebastian+0.023
 shrimp+0.023
 fisher+0.023
Neg logit contributions
appointed-0.024
chair-0.024
irds-0.021
unct-0.021
 halls-0.021
Token do
Token ID466
Feature activation-1.25e+00

Pos logit contributions
 tuna+0.040
 Wally+0.040
 Sebastian+0.039
 shrimp+0.039
 fossil+0.038
Neg logit contributions
appointed-0.031
chair-0.029
irds-0.025
unct-0.025
 halls-0.025
Token.
Token ID13
Feature activation+8.70e-01

Pos logit contributions
 Wally+0.051
 Sebastian+0.050
 tuna+0.049
 shrimp+0.048
 fisher+0.048
Neg logit contributions
appointed-0.048
chair-0.047
 halls-0.042
cases-0.041
irds-0.041
Token Then
Token ID3244
Feature activation+6.14e-01

Pos logit contributions
appointed+0.041
unct+0.040
chair+0.040
irds+0.039
ring+0.038
Neg logit contributions
 Wally-0.027
 Sebastian-0.026
 tuna-0.025
 shrimp-0.023
 fisher-0.023
Token
Token ID198
Feature activation+1.62e-01

Pos logit contributions
 Wally+0.001
 tuna+0.001
 shrimp+0.001
 Sebastian+0.001
 fisher+0.001
Neg logit contributions
appointed-0.002
chair-0.002
irds-0.002
unct-0.002
 halls-0.002
Token"
Token ID1
Feature activation-2.08e-01

Pos logit contributions
appointed+0.008
chair+0.007
unct+0.007
irds+0.007
 halls+0.007
Neg logit contributions
 Wally-0.005
 tuna-0.005
 Sebastian-0.005
 shrimp-0.005
 fisher-0.005
TokenA
Token ID32
Feature activation-1.28e+00

Pos logit contributions
 Wally+0.008
 tuna+0.008
 Sebastian+0.008
 shrimp+0.008
 fisher+0.008
Neg logit contributions
appointed-0.009
chair-0.008
irds-0.007
unct-0.007
flash-0.007
Token mineral
Token ID18352
Feature activation-1.78e+00

Pos logit contributions
 Wally+0.043
 Sebastian+0.040
 Finn+0.035
 shrimp+0.035
 tuna+0.035
Neg logit contributions
chair-0.056
appointed-0.055
unct-0.050
est-0.050
up-0.048
Token is
Token ID318
Feature activation-2.64e+00

Pos logit contributions
 Wally+0.074
 Sebastian+0.074
 fisher+0.068
 Otto+0.065
 Finn+0.065
Neg logit contributions
chair-0.067
unct-0.057
appointed-0.056
up-0.056
 ceiling-0.055
Token a
Token ID257
Feature activation-3.83e+00

Pos logit contributions
 fisher+0.125
 Wally+0.123
 Sebastian+0.122
 fisherman+0.122
Dave+0.119
Neg logit contributions
 ceiling-0.061
 walls-0.060
ong-0.056
 roof-0.056
appointed-0.056
Token kind
Token ID1611
Feature activation-2.53e+00

Pos logit contributions
 Wally+0.152
 Sebastian+0.145
 fisher+0.137
Stan+0.134
Dave+0.133
Neg logit contributions
 ceiling-0.134
 roof-0.128
 wall-0.123
 biggest-0.119
 top-0.113
Token of
Token ID286
Feature activation-2.43e+00

Pos logit contributions
 Wally+0.117
 Sebastian+0.106
 Finn+0.105
 Otto+0.103
Stan+0.103
Neg logit contributions
est-0.087
ness-0.084
chair-0.075
 ceiling-0.071
ly-0.069
Token rock
Token ID3881
Feature activation-5.19e-01

Pos logit contributions
 Wally+0.092
 Sebastian+0.089
 fisher+0.088
Stan+0.086
 fisherman+0.081
Neg logit contributions
ness-0.088
 ceiling-0.087
appointed-0.084
chair-0.084
 walls-0.081
Token that
Token ID326
Feature activation-4.05e-01

Pos logit contributions
 Wally+0.020
 Sebastian+0.019
 tuna+0.019
 fisher+0.018
 shrimp+0.018
Neg logit contributions
chair-0.022
appointed-0.021
unct-0.019
irds-0.019
ring-0.019
Token comes
Token ID2058
Feature activation+5.86e-01

Pos logit contributions
 Wally+0.017
 Sebastian+0.016
 tuna+0.016
 fisher+0.015
 shrimp+0.015
Neg logit contributions
appointed-0.015
chair-0.015
unct-0.014
irds-0.013
 halls-0.013
TokenThese
Token ID4711
Feature activation-1.08e+00

Pos logit contributions
 Wally+0.019
 tuna+0.019
 shrimp+0.018
 fisher+0.018
 Sebastian+0.018
Neg logit contributions
appointed-0.018
chair-0.017
irds-0.015
clock-0.015
flash-0.015
Token are
Token ID389
Feature activation-2.75e+00

Pos logit contributions
 Wally+0.042
 Sebastian+0.039
 fisher+0.038
 Finn+0.038
 tuna+0.038
Neg logit contributions
appointed-0.040
chair-0.038
bows-0.037
cases-0.037
up-0.036
Token peanuts
Token ID45612
Feature activation-4.76e-01

Pos logit contributions
 fisher+0.104
 fisherman+0.098
 Sebastian+0.093
Dave+0.091
Patrick+0.090
Neg logit contributions
appointed-0.083
 roof-0.081
 biggest-0.081
 walls-0.081
clock-0.080
Token.
Token ID13
Feature activation-2.26e+00

Pos logit contributions
 Wally+0.016
 Sebastian+0.016
 tuna+0.015
 fisher+0.015
 shrimp+0.015
Neg logit contributions
chair-0.021
appointed-0.021
bows-0.019
irds-0.019
cases-0.019
Token They
Token ID1119
Feature activation-1.62e+00

Pos logit contributions
 fisher+0.095
 Wally+0.095
 shrimp+0.094
 tuna+0.093
 fisherman+0.092
Neg logit contributions
chair-0.072
appointed-0.070
clock-0.062
irds-0.061
 Lightning-0.060
Token are
Token ID389
Feature activation-3.79e+00

Pos logit contributions
 Wally+0.078
 Sebastian+0.075
 fisher+0.074
 tuna+0.073
 Finn+0.072
Neg logit contributions
appointed-0.046
bows-0.040
chair-0.039
unct-0.039
 roof-0.039
Token interesting
Token ID3499
Feature activation-1.58e+00

Pos logit contributions
 fisher+0.167
Patrick+0.163
 fisherman+0.159
Dave+0.159
inn+0.152
Neg logit contributions
 roof-0.094
 walls-0.093
 biggest-0.091
 pictures-0.086
 same-0.085
Token because
Token ID780
Feature activation-2.46e+00

Pos logit contributions
 Wally+0.061
 Sebastian+0.058
 fisher+0.055
 Otto+0.054
 tuna+0.052
Neg logit contributions
chair-0.060
appointed-0.057
 halls-0.056
irds-0.056
clock-0.055
Token they
Token ID484
Feature activation-1.87e+00

Pos logit contributions
 fisher+0.099
 Wally+0.097
 fossil+0.093
 Sebastian+0.090
 Otto+0.090
Neg logit contributions
chair-0.079
appointed-0.076
 roof-0.075
 halls-0.072
filled-0.072
Token grow
Token ID1663
Feature activation-5.91e-01

Pos logit contributions
 Wally+0.090
 Sebastian+0.087
 fisher+0.085
 tuna+0.083
 fisherman+0.082
Neg logit contributions
appointed-0.052
bows-0.049
chair-0.045
unct-0.045
 halls-0.045
Token underground
Token ID11447
Feature activation-5.49e-01

Pos logit contributions
 Wally+0.025
 Sebastian+0.024
 tuna+0.024
 fisher+0.023
 shrimp+0.023
Neg logit contributions
appointed-0.022
chair-0.020
irds-0.019
cases-0.019
 halls-0.019
Token basket
Token ID7988
Feature activation+2.38e-01

Pos logit contributions
appointed+0.003
chair+0.003
irds+0.002
unct+0.002
 halls+0.002
Neg logit contributions
 Wally-0.003
 tuna-0.003
 Sebastian-0.003
 shrimp-0.003
 fisher-0.003
Token,
Token ID11
Feature activation-2.42e-01

Pos logit contributions
appointed+0.011
chair+0.011
unct+0.010
irds+0.010
 halls+0.010
Neg logit contributions
 Wally-0.008
 tuna-0.008
 shrimp-0.007
 Sebastian-0.007
 fisher-0.007
Token Jack
Token ID3619
Feature activation-1.13e+00

Pos logit contributions
 Wally+0.007
 tuna+0.007
 Sebastian+0.007
 fisher+0.007
 shrimp+0.007
Neg logit contributions
appointed-0.012
chair-0.012
irds-0.011
 halls-0.011
cases-0.011
Token found
Token ID1043
Feature activation-8.53e-01

Pos logit contributions
 Wally+0.042
 Sebastian+0.039
 tuna+0.039
 fisher+0.039
 shrimp+0.038
Neg logit contributions
chair-0.046
appointed-0.046
 halls-0.042
unct-0.041
cases-0.041
Token delicious
Token ID12625
Feature activation-1.39e+00

Pos logit contributions
 fisher+0.037
 Wally+0.036
 Sebastian+0.034
 tuna+0.034
 shrimp+0.034
Neg logit contributions
chair-0.027
appointed-0.026
cases-0.024
filled-0.024
 halls-0.024
Token sandwiches
Token ID29777
Feature activation-3.79e+00

Pos logit contributions
 Wally+0.064
 Sebastian+0.061
 fisher+0.061
 Finn+0.058
 Fin+0.058
Neg logit contributions
chair-0.041
appointed-0.039
 halls-0.038
bows-0.037
 floors-0.036
Token,
Token ID11
Feature activation-1.98e+00

Pos logit contributions
 fisher+0.178
 Sebastian+0.168
 fossil+0.167
 Crush+0.160
Fin+0.157
Neg logit contributions
chair-0.113
cakes-0.096
case-0.095
cases-0.092
als-0.091
Token apples
Token ID22514
Feature activation-2.84e+00

Pos logit contributions
 fisher+0.079
 Sebastian+0.076
 Wally+0.075
 fossil+0.073
 Finn+0.072
Neg logit contributions
chair-0.069
appointed-0.066
cases-0.064
 walls-0.063
 halls-0.062
Token,
Token ID11
Feature activation-1.81e+00

Pos logit contributions
 fisher+0.124
 Sebastian+0.118
 fossil+0.116
 Finn+0.114
 Wally+0.113
Neg logit contributions
chair-0.081
ics-0.080
cases-0.076
 roof-0.076
bows-0.075
Token and
Token ID290
Feature activation-4.12e-01

Pos logit contributions
 Sebastian+0.062
 Wally+0.060
 fisher+0.060
 Finn+0.058
 fossil+0.057
Neg logit contributions
chair-0.073
cases-0.070
appointed-0.069
 walls-0.068
 roof-0.068
Token oranges
Token ID48389
Feature activation-1.94e+00

Pos logit contributions
 Wally+0.015
 Sebastian+0.015
 fisher+0.014
 tuna+0.014
 shrimp+0.014
Neg logit contributions
appointed-0.017
chair-0.017
 halls-0.015
cases-0.015
irds-0.015
Token regular
Token ID3218
Feature activation-1.55e+00

Pos logit contributions
 Sebastian+0.119
 Wally+0.113
 Otto+0.107
 Finn+0.105
 Hank+0.102
Neg logit contributions
clock-0.084
chair-0.082
bye-0.075
 job-0.074
est-0.073
Token pizza
Token ID14256
Feature activation-3.47e+00

Pos logit contributions
 Wally+0.058
 Sebastian+0.055
 Otto+0.053
 Finn+0.052
 Fin+0.049
Neg logit contributions
clock-0.064
chair-0.062
ness-0.059
appointed-0.058
est-0.058
Token.
Token ID13
Feature activation-1.45e+00

Pos logit contributions
 Sebastian+0.153
 Wally+0.141
 Otto+0.137
 shrimp+0.130
 Crush+0.130
Neg logit contributions
chair-0.113
clock-0.104
bye-0.099
bows-0.097
case-0.095
Token A
Token ID317
Feature activation-1.52e+00

Pos logit contributions
 fossil+0.055
 Sebastian+0.053
 Wally+0.052
 shrimp+0.051
 fisher+0.051
Neg logit contributions
chair-0.054
appointed-0.053
clock-0.048
 halls-0.046
 Everyone-0.045
Token regular
Token ID3218
Feature activation-1.69e+00

Pos logit contributions
 Wally+0.066
 Sebastian+0.065
 Finn+0.062
 Otto+0.061
 Fin+0.059
Neg logit contributions
chair-0.052
appointed-0.047
clock-0.047
 halls-0.046
 roof-0.046
Token pizza
Token ID14256
Feature activation-3.78e+00

Pos logit contributions
 Wally+0.065
 Sebastian+0.063
 Finn+0.061
 Otto+0.060
 Fin+0.058
Neg logit contributions
chair-0.068
ness-0.061
appointed-0.060
 halls-0.060
 roof-0.059
Token has
Token ID468
Feature activation-1.53e+00

Pos logit contributions
 Sebastian+0.177
 Otto+0.162
 Wally+0.162
 Finn+0.152
 Hank+0.149
Neg logit contributions
chair-0.117
clock-0.102
case-0.101
bye-0.098
cakes-0.098
Token cheese
Token ID9891
Feature activation-3.51e+00

Pos logit contributions
 Wally+0.065
 Sebastian+0.063
 Finn+0.060
 Otto+0.060
 fisher+0.058
Neg logit contributions
clock-0.051
chair-0.051
appointed-0.050
 halls-0.048
 ceiling-0.047
Token and
Token ID290
Feature activation-1.35e+00

Pos logit contributions
 Sebastian+0.141
 Wally+0.131
 Finn+0.127
 Otto+0.127
 Fin+0.119
Neg logit contributions
chair-0.122
cases-0.113
bows-0.112
 ceiling-0.111
case-0.111
Token sauce
Token ID10746
Feature activation-2.77e+00

Pos logit contributions
 Wally+0.045
 Sebastian+0.043
 fisher+0.040
 Finn+0.039
 Otto+0.039
Neg logit contributions
chair-0.058
appointed-0.057
clock-0.055
 halls-0.053
 ceiling-0.053
Token and
Token ID290
Feature activation-1.44e+00

Pos logit contributions
 Sebastian+0.115
 Wally+0.109
 fisher+0.103
 Otto+0.103
 fossil+0.102
Neg logit contributions
chair-0.096
cases-0.091
bows-0.088
ong-0.086
cakes-0.086
Token and
Token ID290
Feature activation-5.80e-01

Pos logit contributions
 Sebastian+0.051
 Wally+0.050
 Otto+0.044
 fisher+0.044
 tuna+0.044
Neg logit contributions
chair-0.056
appointed-0.054
 ceiling-0.053
 halls-0.052
 roof-0.051
Token look
Token ID804
Feature activation-1.15e+00

Pos logit contributions
 Wally+0.025
 Sebastian+0.025
 tuna+0.024
 fisher+0.024
 shrimp+0.023
Neg logit contributions
appointed-0.020
chair-0.019
 halls-0.019
 roof-0.018
unct-0.018
Token for
Token ID329
Feature activation-1.36e+00

Pos logit contributions
 Wally+0.050
 Sebastian+0.049
 Crab+0.047
 fisher+0.047
 tuna+0.046
Neg logit contributions
chair-0.039
appointed-0.039
 halls-0.037
ong-0.036
unct-0.036
Token bugs
Token ID11316
Feature activation-2.01e+00

Pos logit contributions
 Wally+0.061
 Sebastian+0.059
 fisher+0.055
 Otto+0.053
 fossil+0.052
Neg logit contributions
 halls-0.041
 walls-0.041
 floors-0.040
 roof-0.040
 ceiling-0.039
Token.
Token ID13
Feature activation-1.17e+00

Pos logit contributions
 Sebastian+0.097
 Wally+0.090
 fisher+0.088
 Otto+0.085
 tuna+0.084
Neg logit contributions
chair-0.064
bows-0.057
 halls-0.055
ring-0.055
cases-0.054
Token Ben
Token ID3932
Feature activation-3.77e+00

Pos logit contributions
 fossil+0.047
 fisher+0.045
 Sebastian+0.044
 Wally+0.044
 tuna+0.044
Neg logit contributions
appointed-0.043
chair-0.039
irds-0.038
 halls-0.037
rolling-0.036
Token finds
Token ID7228
Feature activation-8.28e-01

Pos logit contributions
 Sebastian+0.210
 Wally+0.186
 Crab+0.186
 fisher+0.183
 Crush+0.182
Neg logit contributions
's-0.081
chair-0.079
 halls-0.077
 starts-0.069
 sounds-0.068
Token a
Token ID257
Feature activation-1.98e+00

Pos logit contributions
 Wally+0.035
 Sebastian+0.035
 fisher+0.034
 tuna+0.032
 shrimp+0.032
Neg logit contributions
chair-0.027
appointed-0.026
 halls-0.025
irds-0.024
up-0.024
Token big
Token ID1263
Feature activation-2.45e+00

Pos logit contributions
 Sebastian+0.098
 Wally+0.098
 fisher+0.091
 Fin+0.088
 Finn+0.088
Neg logit contributions
 roof-0.054
 ceiling-0.052
chair-0.048
 stair-0.047
appointed-0.045
Token worm
Token ID23849
Feature activation-2.57e+00

Pos logit contributions
 Sebastian+0.115
 Wally+0.113
 fisher+0.105
Stan+0.103
 Otto+0.103
Neg logit contributions
 roof-0.076
 ceiling-0.074
 stair-0.071
 wall-0.069
chair-0.068
Token and
Token ID290
Feature activation-1.42e+00

Pos logit contributions
 Sebastian+0.116
 fisher+0.111
 Wally+0.106
Dave+0.098
 Finn+0.097
Neg logit contributions
chair-0.095
age-0.083
case-0.082
 roof-0.079
 ceiling-0.078
Token,
Token ID11
Feature activation-1.07e+00

Pos logit contributions
 Wally+0.007
 tuna+0.007
 Sebastian+0.006
 shrimp+0.006
 fisher+0.006
Neg logit contributions
appointed-0.010
chair-0.009
irds-0.009
unct-0.009
ring-0.009
Token "
Token ID366
Feature activation-3.60e-01

Pos logit contributions
 Wally+0.043
 tuna+0.043
 shrimp+0.041
 fisher+0.041
 Sebastian+0.041
Neg logit contributions
appointed-0.043
chair-0.038
ring-0.036
amas-0.036
irds-0.035
TokenIt
Token ID1026
Feature activation-9.18e-01

Pos logit contributions
 Wally+0.015
 tuna+0.014
 shrimp+0.014
 Sebastian+0.014
 fisher+0.014
Neg logit contributions
appointed-0.014
chair-0.013
irds-0.013
ring-0.012
unct-0.012
Token's
Token ID338
Feature activation-2.11e+00

Pos logit contributions
 Wally+0.053
 Sebastian+0.051
 shrimp+0.050
 fisher+0.050
 tuna+0.049
Neg logit contributions
appointed-0.019
chair-0.017
unct-0.014
irds-0.014
ring-0.013
Token a
Token ID257
Feature activation-3.00e+00

Pos logit contributions
 fisher+0.093
 Wally+0.089
 fisherman+0.087
 Sebastian+0.086
 Finn+0.084
Neg logit contributions
appointed-0.063
clock-0.056
 ceiling-0.049
 walls-0.049
ness-0.048
Token tuna
Token ID38883
Feature activation-3.77e+00

Pos logit contributions
 Wally+0.111
 Sebastian+0.108
 Finn+0.102
 Fin+0.098
 chefs+0.097
Neg logit contributions
 whole-0.095
 biggest-0.092
appointed-0.091
 ceiling-0.091
 same-0.090
Token,
Token ID11
Feature activation-3.06e+00

Pos logit contributions
 Sebastian+0.129
 Wally+0.119
 Fin+0.108
Phill+0.103
 Otto+0.103
Neg logit contributions
chair-0.132
 ceiling-0.130
ness-0.129
est-0.128
als-0.127
Token and
Token ID290
Feature activation-2.03e+00

Pos logit contributions
 fisher+0.109
 Wally+0.100
 Sebastian+0.096
inn+0.094
 fossil+0.093
Neg logit contributions
chair-0.103
 roof-0.098
 ceiling-0.098
irds-0.096
ring-0.094
Token we
Token ID356
Feature activation-2.65e+00

Pos logit contributions
 Wally+0.081
 Sebastian+0.079
 fisher+0.078
 Finn+0.072
 Otto+0.071
Neg logit contributions
chair-0.066
appointed-0.065
 ceiling-0.059
irds-0.059
 roof-0.059
Token can
Token ID460
Feature activation-2.56e+00

Pos logit contributions
 Wally+0.124
 tuna+0.115
 Sebastian+0.114
 fisher+0.112
 Otto+0.112
Neg logit contributions
appointed-0.072
ring-0.068
 roof-0.066
unct-0.062
 ceiling-0.062
Token catch
Token ID4929
Feature activation-1.50e+00

Pos logit contributions
 fossil+0.115
 Wally+0.114
 chefs+0.113
 tuna+0.112
 fisher+0.111
Neg logit contributions
appointed-0.065
rolling-0.057
 roof-0.054
 top-0.054
 doorstep-0.054
Token to
Token ID284
Feature activation+5.82e-01

Pos logit contributions
appointed+0.008
chair+0.007
unct+0.007
irds+0.007
 halls+0.007
Neg logit contributions
 Wally-0.008
 tuna-0.008
 Sebastian-0.008
 shrimp-0.007
 fisher-0.007
Token play
Token ID711
Feature activation-1.40e+00

Pos logit contributions
appointed+0.023
chair+0.022
unct+0.020
irds+0.020
flash+0.019
Neg logit contributions
 Wally-0.024
 tuna-0.024
 shrimp-0.023
 Sebastian-0.023
 fisher-0.022
Token with
Token ID351
Feature activation-4.51e-01

Pos logit contributions
 Sebastian+0.056
 Wally+0.053
 fossil+0.049
 Crab+0.049
 fisher+0.049
Neg logit contributions
chair-0.051
appointed-0.047
amas-0.047
 halls-0.046
 walls-0.046
Token toys
Token ID14958
Feature activation-2.00e+00

Pos logit contributions
 Wally+0.018
 Sebastian+0.018
 fisher+0.017
 tuna+0.017
 Crab+0.016
Neg logit contributions
chair-0.017
appointed-0.016
 halls-0.016
flash-0.015
irds-0.015
Token.
Token ID13
Feature activation-1.16e+00

Pos logit contributions
 Sebastian+0.101
 Wally+0.095
 Otto+0.091
 fisher+0.090
 Crab+0.089
Neg logit contributions
chair-0.065
bows-0.054
 ceiling-0.052
 halls-0.051
cases-0.051
Token Ben
Token ID3932
Feature activation-3.77e+00

Pos logit contributions
 Sebastian+0.061
 fossil+0.060
 fisher+0.060
 Wally+0.059
 tuna+0.058
Neg logit contributions
appointed-0.028
chair-0.027
 halls-0.023
clock-0.023
 ceiling-0.022
Token has
Token ID468
Feature activation-2.03e-02

Pos logit contributions
 Sebastian+0.232
 Crab+0.213
 Wally+0.211
 fisher+0.208
 pearl+0.208
Neg logit contributions
's-0.062
chair-0.059
 halls-0.054
 starts-0.044
 sounds-0.043
Token a
Token ID257
Feature activation-1.51e+00

Pos logit contributions
 Wally+0.001
 tuna+0.001
 Sebastian+0.001
 shrimp+0.001
 fisher+0.001
Neg logit contributions
appointed-0.001
chair-0.001
 halls-0.001
irds-0.001
unct-0.001
Token new
Token ID649
Feature activation-1.26e+00

Pos logit contributions
 Wally+0.077
 Sebastian+0.077
 fisher+0.073
 Finn+0.072
 Otto+0.071
Neg logit contributions
chair-0.039
 roof-0.038
 ceiling-0.035
appointed-0.033
 stair-0.032
Token toy
Token ID13373
Feature activation-2.20e+00

Pos logit contributions
 Wally+0.060
 Sebastian+0.057
 fisher+0.056
 Otto+0.055
 Finn+0.055
Neg logit contributions
chair-0.040
 roof-0.037
appointed-0.037
 ceiling-0.035
bows-0.033
Token.
Token ID13
Feature activation-1.50e+00

Pos logit contributions
 Wally+0.100
 Sebastian+0.097
 fisher+0.092
 Otto+0.092
Stan+0.091
Neg logit contributions
chair-0.079
 roof-0.069
 ceiling-0.066
clock-0.061
cases-0.060
Tokenily
Token ID813
Feature activation+4.64e-02

Pos logit contributions
appointed+0.004
chair+0.004
unct+0.004
irds+0.004
 halls+0.004
Neg logit contributions
 Wally-0.004
 tuna-0.004
 Sebastian-0.004
 shrimp-0.004
 fisher-0.004
Token offered
Token ID4438
Feature activation+3.65e-01

Pos logit contributions
appointed+0.002
chair+0.002
unct+0.002
irds+0.002
 halls+0.002
Neg logit contributions
 Wally-0.002
 tuna-0.002
 Sebastian-0.001
 shrimp-0.001
 fisher-0.001
Token Tim
Token ID5045
Feature activation-3.65e-02

Pos logit contributions
appointed+0.015
chair+0.015
irds+0.013
unct+0.013
 halls+0.013
Neg logit contributions
 Wally-0.015
 tuna-0.014
 shrimp-0.014
 Sebastian-0.014
 fisher-0.013
Tokenmy
Token ID1820
Feature activation-3.40e-01

Pos logit contributions
 Wally+0.002
 tuna+0.002
 Sebastian+0.002
 shrimp+0.002
 fisher+0.001
Neg logit contributions
appointed-0.001
chair-0.001
unct-0.001
irds-0.001
 halls-0.001
Token some
Token ID617
Feature activation-1.11e+00

Pos logit contributions
 Wally+0.013
 Sebastian+0.012
 tuna+0.012
 fisher+0.012
 shrimp+0.012
Neg logit contributions
appointed-0.014
chair-0.013
unct-0.012
irds-0.012
 halls-0.012
Token food
Token ID2057
Feature activation-3.76e+00

Pos logit contributions
 Wally+0.045
 Sebastian+0.044
 fisher+0.041
 Finn+0.040
 Otto+0.039
Neg logit contributions
chair-0.042
appointed-0.040
irds-0.036
 halls-0.036
unct-0.036
Token,
Token ID11
Feature activation-2.21e+00

Pos logit contributions
 Sebastian+0.157
 fisher+0.156
 fossil+0.150
 Wally+0.146
 fisherman+0.137
Neg logit contributions
chair-0.120
cakes-0.106
age-0.102
bows-0.101
ring-0.101
Token but
Token ID475
Feature activation-1.76e-01

Pos logit contributions
 fisher+0.083
 fossil+0.080
 Sebastian+0.077
 Wally+0.076
 shark+0.074
Neg logit contributions
chair-0.083
appointed-0.082
bars-0.075
 roof-0.074
irds-0.072
Token he
Token ID339
Feature activation-1.34e+00

Pos logit contributions
 Wally+0.007
 tuna+0.006
 shrimp+0.006
 fisher+0.006
 Sebastian+0.006
Neg logit contributions
appointed-0.007
chair-0.007
unct-0.007
irds-0.006
 halls-0.006
Token said
Token ID531
Feature activation-7.48e-01

Pos logit contributions
 Wally+0.059
 tuna+0.058
 fisher+0.058
 shrimp+0.058
 Sebastian+0.057
Neg logit contributions
appointed-0.044
chair-0.042
unct-0.037
up-0.035
owed-0.035
Token no
Token ID645
Feature activation+2.34e-01

Pos logit contributions
 Wally+0.026
 tuna+0.025
 fisher+0.025
 Sebastian+0.025
 fossil+0.024
Neg logit contributions
appointed-0.031
chair-0.031
irds-0.028
unct-0.028
ring-0.027
Token It
Token ID632
Feature activation-1.95e+00

Pos logit contributions
 Wally+0.120
 shrimp+0.117
 Sebastian+0.114
 pearl+0.112
 guarding+0.112
Neg logit contributions
clock-0.076
 Everyone-0.067
appointed-0.067
EEP-0.066
chair-0.064
Token has
Token ID468
Feature activation-1.87e+00

Pos logit contributions
 Sebastian+0.102
 Wally+0.099
 shrimp+0.092
 fisher+0.091
 Otto+0.090
Neg logit contributions
chair-0.045
appointed-0.044
clock-0.044
 sounds-0.039
bows-0.039
Token a
Token ID257
Feature activation-3.05e+00

Pos logit contributions
 Wally+0.072
 Sebastian+0.072
 Finn+0.066
 Otto+0.066
 fisher+0.065
Neg logit contributions
clock-0.066
appointed-0.064
 ceiling-0.063
chair-0.063
 halls-0.061
Token lot
Token ID1256
Feature activation-2.20e+00

Pos logit contributions
 Wally+0.142
 Sebastian+0.140
 Finn+0.136
 Fin+0.132
 Otto+0.130
Neg logit contributions
 roof-0.084
 ceiling-0.083
 wall-0.076
appointed-0.075
clock-0.075
Token of
Token ID286
Feature activation-2.24e+00

Pos logit contributions
 Sebastian+0.124
 Wally+0.120
 Finn+0.112
 Otto+0.112
 Fin+0.111
Neg logit contributions
chair-0.049
up-0.044
ong-0.043
clock-0.043
appointed-0.039
Token cheese
Token ID9891
Feature activation-3.76e+00

Pos logit contributions
 Wally+0.099
 Sebastian+0.098
 Finn+0.091
 Otto+0.090
 fisher+0.089
Neg logit contributions
appointed-0.067
ness-0.067
 walls-0.066
 floors-0.066
 sides-0.064
Token and
Token ID290
Feature activation-1.86e+00

Pos logit contributions
 Sebastian+0.169
 Wally+0.156
 Otto+0.145
 Fin+0.144
 Finn+0.142
Neg logit contributions
chair-0.112
 ceiling-0.109
ness-0.108
 walls-0.104
ong-0.102
Token sauce
Token ID10746
Feature activation-2.99e+00

Pos logit contributions
 Sebastian+0.061
 Wally+0.060
 fisher+0.054
 Finn+0.053
 Otto+0.052
Neg logit contributions
chair-0.077
appointed-0.075
clock-0.074
 ceiling-0.073
 halls-0.071
Token and
Token ID290
Feature activation-1.92e+00

Pos logit contributions
 Sebastian+0.136
 Wally+0.129
 Fin+0.121
 Otto+0.119
 fisher+0.117
Neg logit contributions
chair-0.096
 ceiling-0.086
ong-0.085
cases-0.084
ring-0.084
Token bread
Token ID8509
Feature activation-3.32e+00

Pos logit contributions
 Sebastian+0.062
 Wally+0.062
 fisher+0.055
 Finn+0.054
 Otto+0.054
Neg logit contributions
chair-0.079
appointed-0.077
clock-0.076
 ceiling-0.075
ness-0.074
Token.
Token ID13
Feature activation-2.21e+00

Pos logit contributions
 Sebastian+0.148
 Wally+0.142
 Otto+0.131
 fisher+0.129
Bruce+0.126
Neg logit contributions
ness-0.094
ring-0.091
als-0.090
bows-0.090
ong-0.089
Token,
Token ID11
Feature activation-1.03e+00

Pos logit contributions
 Sebastian+0.072
 Wally+0.071
 fisher+0.069
 fossil+0.069
 Finn+0.065
Neg logit contributions
chair-0.068
appointed-0.061
 ceiling-0.058
ness-0.056
 roof-0.055
Token so
Token ID523
Feature activation+1.23e-01

Pos logit contributions
 Wally+0.034
 fisher+0.034
 tuna+0.033
 fossil+0.033
 Sebastian+0.032
Neg logit contributions
appointed-0.047
chair-0.045
 halls-0.040
cases-0.039
irds-0.039
Token he
Token ID339
Feature activation-7.16e-01

Pos logit contributions
appointed+0.007
chair+0.007
unct+0.006
irds+0.006
 halls+0.006
Neg logit contributions
 Wally-0.003
 tuna-0.003
 Sebastian-0.003
 shrimp-0.003
 fisher-0.003
Token looked
Token ID3114
Feature activation-1.25e+00

Pos logit contributions
 Wally+0.029
 Sebastian+0.029
 tuna+0.028
 fisher+0.028
 shrimp+0.027
Neg logit contributions
appointed-0.027
chair-0.026
 halls-0.024
unct-0.023
irds-0.023
Token for
Token ID329
Feature activation-9.66e-01

Pos logit contributions
 Sebastian+0.057
 Wally+0.055
 fisher+0.054
 tuna+0.051
 Crab+0.051
Neg logit contributions
appointed-0.043
chair-0.040
 halls-0.038
unct-0.037
cases-0.035
Token food
Token ID2057
Feature activation-3.76e+00

Pos logit contributions
 Wally+0.043
 Sebastian+0.041
 fisher+0.041
 fossil+0.038
 shrimp+0.038
Neg logit contributions
appointed-0.030
chair-0.029
 halls-0.028
filled-0.027
 roof-0.026
Token.
Token ID13
Feature activation-5.41e-01

Pos logit contributions
 Sebastian+0.190
 fossil+0.190
 fisher+0.188
 Crush+0.175
 salmon+0.175
Neg logit contributions
chair-0.071
cases-0.068
 grounds-0.061
als-0.061
seek-0.058
Token He
Token ID679
Feature activation-7.69e-01

Pos logit contributions
 Wally+0.019
 tuna+0.019
 shrimp+0.019
 fisher+0.018
 Sebastian+0.018
Neg logit contributions
appointed-0.024
chair-0.023
irds-0.021
 halls-0.021
flash-0.020
Token found
Token ID1043
Feature activation-3.48e-01

Pos logit contributions
 Wally+0.030
 Sebastian+0.030
 fisher+0.029
 tuna+0.029
 shrimp+0.028
Neg logit contributions
appointed-0.030
chair-0.029
 halls-0.028
unct-0.026
irds-0.025
Token a
Token ID257
Feature activation-2.56e+00

Pos logit contributions
 Wally+0.014
 Sebastian+0.013
 fisher+0.013
 tuna+0.013
 shrimp+0.013
Neg logit contributions
appointed-0.013
chair-0.013
unct-0.012
 halls-0.011
irds-0.011
Token shell
Token ID7582
Feature activation-2.01e+00

Pos logit contributions
 Sebastian+0.140
 Wally+0.137
 Finn+0.131
 Fin+0.130
 fisher+0.129
Neg logit contributions
 roof-0.057
 ceiling-0.053
 stair-0.045
chair-0.044
 whole-0.044
Token Tim
Token ID5045
Feature activation+5.49e-01

Pos logit contributions
appointed+0.012
chair+0.011
unct+0.010
irds+0.010
 halls+0.010
Neg logit contributions
 Wally-0.008
 tuna-0.008
 shrimp-0.007
 Sebastian-0.007
 fisher-0.007
Token wanted
Token ID2227
Feature activation-4.35e-01

Pos logit contributions
appointed+0.020
chair+0.019
irds+0.017
unct+0.017
 halls+0.016
Neg logit contributions
 Wally-0.025
 tuna-0.024
 Sebastian-0.023
 shrimp-0.023
 fisher-0.023
Token to
Token ID284
Feature activation+5.75e-01

Pos logit contributions
 Wally+0.020
 Sebastian+0.019
 fisher+0.019
 tuna+0.018
 shrimp+0.018
Neg logit contributions
chair-0.014
appointed-0.014
 halls-0.012
clock-0.012
unct-0.012
Token find
Token ID1064
Feature activation-1.34e+00

Pos logit contributions
appointed+0.022
chair+0.021
unct+0.019
irds+0.019
flash+0.018
Neg logit contributions
 Wally-0.025
 tuna-0.024
 shrimp-0.023
 Sebastian-0.023
 fisher-0.023
Token some
Token ID617
Feature activation-2.06e+00

Pos logit contributions
 Wally+0.056
 fisher+0.055
 Sebastian+0.052
 shrimp+0.051
 fossil+0.050
Neg logit contributions
chair-0.045
up-0.040
appointed-0.039
flash-0.039
 Lightning-0.038
Token food
Token ID2057
Feature activation-3.76e+00

Pos logit contributions
 Sebastian+0.092
 Wally+0.092
 fisher+0.088
 Crab+0.086
 Finn+0.085
Neg logit contributions
chair-0.065
 roof-0.056
 ceiling-0.055
appointed-0.055
 halls-0.054
Token,
Token ID11
Feature activation-1.61e+00

Pos logit contributions
 fisher+0.176
 Sebastian+0.171
 fossil+0.168
 fisherman+0.158
 Crush+0.157
Neg logit contributions
chair-0.096
case-0.080
cases-0.080
ics-0.076
 roof-0.074
Token so
Token ID523
Feature activation+9.53e-01

Pos logit contributions
 fisher+0.068
 Wally+0.068
 fossil+0.066
 Sebastian+0.064
 fisherman+0.062
Neg logit contributions
chair-0.055
appointed-0.052
cases-0.048
 roof-0.048
ong-0.047
Token he
Token ID339
Feature activation-3.05e-01

Pos logit contributions
appointed+0.060
chair+0.057
unct+0.055
amas+0.055
ring+0.055
Neg logit contributions
 Wally-0.018
 tuna-0.017
 Sebastian-0.016
 shrimp-0.016
 Fin-0.014
Token hopped
Token ID45230
Feature activation-1.62e+00

Pos logit contributions
 Wally+0.012
 tuna+0.011
 Sebastian+0.011
 shrimp+0.011
 fisher+0.011
Neg logit contributions
appointed-0.013
chair-0.012
unct-0.011
irds-0.011
 halls-0.011
Token around
Token ID1088
Feature activation-2.02e+00

Pos logit contributions
 fisher+0.079
 fossil+0.077
 tuna+0.077
 Sebastian+0.076
 Wally+0.076
Neg logit contributions
appointed-0.048
chair-0.037
filled-0.036
unct-0.034
up-0.033
Token.
Token ID13
Feature activation+3.69e-02

Pos logit contributions
 Wally+0.016
 Sebastian+0.016
 tuna+0.015
 shrimp+0.015
 fisher+0.015
Neg logit contributions
appointed-0.018
chair-0.017
 halls-0.016
irds-0.015
clock-0.015
Token He
Token ID679
Feature activation-7.14e-01

Pos logit contributions
appointed+0.002
chair+0.002
irds+0.002
unct+0.002
 halls+0.002
Neg logit contributions
 Wally-0.001
 tuna-0.001
 shrimp-0.001
 Sebastian-0.001
 fisher-0.001
Token found
Token ID1043
Feature activation-2.95e-01

Pos logit contributions
 Wally+0.029
 Sebastian+0.028
 fisher+0.027
 tuna+0.027
 shrimp+0.027
Neg logit contributions
appointed-0.027
chair-0.026
 halls-0.024
unct-0.023
irds-0.023
Token some
Token ID617
Feature activation-2.08e+00

Pos logit contributions
 Wally+0.012
 Sebastian+0.011
 tuna+0.011
 fisher+0.011
 shrimp+0.011
Neg logit contributions
appointed-0.011
chair-0.011
unct-0.010
irds-0.010
 halls-0.010
Token plastic
Token ID7309
Feature activation-3.07e+00

Pos logit contributions
 Sebastian+0.099
 Wally+0.095
 fisher+0.094
 Crab+0.090
 Finn+0.089
Neg logit contributions
chair-0.055
appointed-0.055
 walls-0.053
 roof-0.052
est-0.052
Token containers
Token ID16472
Feature activation-3.75e+00

Pos logit contributions
 Sebastian+0.130
 Wally+0.123
Dave+0.112
 fisher+0.111
Stan+0.111
Neg logit contributions
 walls-0.101
 roof-0.101
als-0.101
 ceiling-0.097
chair-0.096
Token.
Token ID13
Feature activation-7.04e-01

Pos logit contributions
 Sebastian+0.179
 fisher+0.175
 Wally+0.169
 fossil+0.166
 Finn+0.164
Neg logit contributions
chair-0.106
cakes-0.093
case-0.092
cases-0.090
 walls-0.088
Token He
Token ID679
Feature activation-1.33e+00

Pos logit contributions
 tuna+0.025
 shrimp+0.025
 Sebastian+0.025
 fisher+0.024
 Wally+0.024
Neg logit contributions
appointed-0.030
chair-0.028
irds-0.026
 halls-0.024
amas-0.024
Token thought
Token ID1807
Feature activation-1.56e+00

Pos logit contributions
 Wally+0.056
 Sebastian+0.055
 fisher+0.053
 tuna+0.051
 shrimp+0.050
Neg logit contributions
appointed-0.047
chair-0.045
 halls-0.044
cases-0.040
irds-0.040
Token they
Token ID484
Feature activation-3.51e-01

Pos logit contributions
 Sebastian+0.061
 fisher+0.060
 Wally+0.056
 fossil+0.055
 tuna+0.053
Neg logit contributions
chair-0.059
 roof-0.051
ong-0.051
appointed-0.051
 halls-0.051
Token looked
Token ID3114
Feature activation-2.56e+00

Pos logit contributions
 Wally+0.015
 tuna+0.014
 Sebastian+0.014
 fisher+0.014
 shrimp+0.014
Neg logit contributions
appointed-0.013
chair-0.013
unct-0.012
 halls-0.011
irds-0.011
Token,
Token ID11
Feature activation+2.44e+00

Pos logit contributions
appointed+0.092
irds+0.089
chair+0.088
flash+0.083
cases+0.082
Neg logit contributions
 tuna-0.038
 shrimp-0.036
 Wally-0.035
 Sebastian-0.031
 fisher-0.028
Token she
Token ID673
Feature activation+2.12e+00

Pos logit contributions
appointed+0.126
irds+0.117
EEP+0.116
amas+0.114
ring+0.113
Neg logit contributions
 Wally-0.072
 shrimp-0.071
 tuna-0.070
 like-0.065
 named-0.064
Token would
Token ID561
Feature activation+2.40e+00

Pos logit contributions
appointed+0.117
chair+0.104
irds+0.104
amas+0.103
unct+0.100
Neg logit contributions
 named-0.055
 shrimp-0.053
 tuna-0.052
 Wally-0.050
 fisher-0.046
Token sing
Token ID1702
Feature activation+2.25e+00

Pos logit contributions
appointed+0.094
chair+0.089
irds+0.086
cases+0.086
flash+0.085
Neg logit contributions
 Wally-0.090
 shrimp-0.088
 tuna-0.088
 named-0.087
 like-0.087
Token a
Token ID257
Feature activation+2.48e+00

Pos logit contributions
appointed+0.136
unct+0.130
bars+0.125
irds+0.124
chair+0.123
Neg logit contributions
 like-0.050
 tuna-0.044
 Wally-0.043
 his-0.038
 shrimp-0.038
Token happy
Token ID3772
Feature activation+3.16e+00

Pos logit contributions
appointed+0.098
cases+0.089
ance+0.089
bars+0.088
unct+0.087
Neg logit contributions
 tuna-0.105
 shrimp-0.103
 fish-0.096
 Wally-0.095
 fisher-0.092
Token tune
Token ID14009
Feature activation+2.37e+00

Pos logit contributions
appointed+0.138
bars+0.135
cases+0.135
limits+0.124
flash+0.123
Neg logit contributions
 tuna-0.125
 shrimp-0.122
 named-0.114
 fish-0.113
 Wally-0.109
Token.
Token ID13
Feature activation+1.48e+00

Pos logit contributions
appointed+0.143
unct+0.132
 halls+0.126
chair+0.126
flash+0.126
Neg logit contributions
 tuna-0.044
 Wally-0.044
 named-0.042
 shrimp-0.041
 like-0.041
Token One
Token ID1881
Feature activation+2.45e+00

Pos logit contributions
appointed+0.082
unct+0.079
cases+0.076
irds+0.074
chair+0.073
Neg logit contributions
 Wally-0.039
 tuna-0.036
 Sebastian-0.034
 Finn-0.033
 Fin-0.032
Token day
Token ID1110
Feature activation+9.65e-01

Pos logit contributions
cases+0.128
appointed+0.125
amas+0.122
unct+0.116
rolling+0.115
Neg logit contributions
 named-0.081
 tuna-0.078
 shrimp-0.076
 Wally-0.071
 fish-0.070
Token she
Token ID673
Feature activation+1.32e+00

Pos logit contributions
appointed+0.059
chair+0.053
unct+0.053
clock+0.052
amas+0.052
Neg logit contributions
 Wally-0.022
 tuna-0.021
 shrimp-0.020
 Sebastian-0.019
 Finn-0.018
Token They
Token ID1119
Feature activation+1.71e-01

Pos logit contributions
appointed+0.003
chair+0.003
irds+0.002
unct+0.002
 halls+0.002
Neg logit contributions
 Wally-0.002
 tuna-0.002
 shrimp-0.002
 Sebastian-0.002
 fisher-0.002
Token played
Token ID2826
Feature activation-4.67e-01

Pos logit contributions
appointed+0.007
chair+0.007
unct+0.006
irds+0.006
 halls+0.006
Neg logit contributions
 Wally-0.007
 tuna-0.006
 shrimp-0.006
 Sebastian-0.006
 fisher-0.006
Token hide
Token ID7808
Feature activation+8.89e-02

Pos logit contributions
 Wally+0.017
 Sebastian+0.016
 tuna+0.016
 shrimp+0.016
 fisher+0.015
Neg logit contributions
appointed-0.020
chair-0.020
 halls-0.018
irds-0.018
amas-0.017
Token-
Token ID12
Feature activation+3.88e+00

Pos logit contributions
appointed+0.005
chair+0.005
irds+0.005
unct+0.005
 halls+0.004
Neg logit contributions
 Wally-0.002
 tuna-0.002
 Sebastian-0.002
 shrimp-0.002
 fisher-0.002
Tokenand
Token ID392
Feature activation+2.85e+00

Pos logit contributions
appointed+0.189
amas+0.162
unct+0.156
oga+0.155
 ceiling+0.155
Neg logit contributions
 shrimp-0.121
 tuna-0.120
 Wally-0.115
Fin-0.113
 fisher-0.113
Token-
Token ID12
Feature activation+3.44e+00

Pos logit contributions
appointed+0.156
amas+0.131
oga+0.129
bars+0.128
 ceiling+0.126
Neg logit contributions
 shrimp-0.091
 tuna-0.088
 Wally-0.084
 fish-0.082
 Finn-0.079
Tokenseek
Token ID36163
Feature activation-6.71e-01

Pos logit contributions
appointed+0.115
amas+0.102
 ceiling+0.100
urious+0.099
olk+0.098
Neg logit contributions
 Wally-0.148
 named-0.138
Fin-0.138
 tuna-0.137
 fisher-0.137
Token and
Token ID290
Feature activation+6.75e-01

Pos logit contributions
 Wally+0.021
 Sebastian+0.021
 tuna+0.020
 shrimp+0.019
 fisher+0.019
Neg logit contributions
appointed-0.032
chair-0.032
 halls-0.030
irds-0.029
bows-0.029
Token jumped
Token ID11687
Feature activation+2.20e-01

Pos logit contributions
appointed+0.033
chair+0.031
irds+0.030
unct+0.029
amas+0.029
Neg logit contributions
 Wally-0.023
 tuna-0.021
 shrimp-0.020
 Sebastian-0.020
 Finn-0.020
Token in
Token ID287
Feature activation-6.40e-01

Pos logit contributions
appointed+0.010
chair+0.010
unct+0.009
irds+0.009
 halls+0.009
Neg logit contributions
 Wally-0.007
 tuna-0.007
 shrimp-0.007
 Sebastian-0.007
 fisher-0.007
Token p
Token ID279
Feature activation+1.83e+00

Pos logit contributions
 Wally+0.025
 Sebastian+0.024
 fisher+0.024
 shrimp+0.023
 tuna+0.023
Neg logit contributions
chair-0.025
appointed-0.024
 halls-0.023
 ceiling-0.022
 roof-0.022
Token so
Token ID523
Feature activation+1.67e+00

Pos logit contributions
appointed+0.048
irds+0.047
chair+0.047
bows+0.044
unct+0.044
Neg logit contributions
 Wally-0.032
 tuna-0.031
 shrimp-0.029
 Sebastian-0.029
 Finn-0.027
Token sad
Token ID6507
Feature activation+1.50e-03

Pos logit contributions
chair+0.082
irds+0.080
bows+0.080
aries+0.079
appointed+0.079
Neg logit contributions
 Wally-0.050
 shrimp-0.047
 tuna-0.046
 fisher-0.044
 Sebastian-0.043
Token but
Token ID475
Feature activation+1.11e+00

Pos logit contributions
appointed+0.000
chair+0.000
unct+0.000
irds+0.000
 halls+0.000
Neg logit contributions
 Wally-0.000
 tuna-0.000
 Sebastian-0.000
 shrimp-0.000
 fisher-0.000
Token then
Token ID788
Feature activation+1.56e+00

Pos logit contributions
appointed+0.058
amas+0.058
irds+0.058
chair+0.057
bows+0.056
Neg logit contributions
 Wally-0.031
 tuna-0.027
 shrimp-0.026
 Sebastian-0.025
 Finn-0.024
Token she
Token ID673
Feature activation+7.20e-01

Pos logit contributions
appointed+0.069
amas+0.064
irds+0.064
chair+0.063
bows+0.062
Neg logit contributions
 Wally-0.058
 tuna-0.055
 shrimp-0.055
 named-0.054
 Sebastian-0.052
Token heard
Token ID2982
Feature activation+3.34e+00

Pos logit contributions
appointed+0.039
chair+0.037
irds+0.035
bows+0.035
amas+0.034
Neg logit contributions
 Wally-0.020
 tuna-0.019
 shrimp-0.018
 Sebastian-0.018
 fisher-0.017
Token a
Token ID257
Feature activation+1.44e+00

Pos logit contributions
amas+0.236
appointed+0.235
akes+0.233
EEP+0.229
irds+0.228
Neg logit contributions
 his-0.043
 named-0.034
 who-0.031
 Wally-0.029
 tuna-0.029
Token lovely
Token ID14081
Feature activation+1.69e+00

Pos logit contributions
appointed+0.054
chair+0.051
irds+0.048
cases+0.047
bows+0.047
Neg logit contributions
 tuna-0.062
 Wally-0.061
 shrimp-0.060
 fisher-0.058
 fish-0.058
Token sound
Token ID2128
Feature activation+4.23e-02

Pos logit contributions
appointed+0.059
cases+0.057
bars+0.056
irds+0.053
chair+0.053
Neg logit contributions
 tuna-0.082
 shrimp-0.080
 Wally-0.078
 fish-0.075
 shark-0.074
Token.
Token ID13
Feature activation+2.25e+00

Pos logit contributions
appointed+0.002
chair+0.002
unct+0.002
irds+0.002
 halls+0.002
Neg logit contributions
 Wally-0.001
 tuna-0.001
 Sebastian-0.001
 shrimp-0.001
 fisher-0.001
Token It
Token ID632
Feature activation+6.75e-01

Pos logit contributions
unct+0.112
appointed+0.107
vation+0.101
arse+0.100
bars+0.099
Neg logit contributions
 Wally-0.076
 tuna-0.070
 shrimp-0.066
 Sebastian-0.063
 Fin-0.063
Token a
Token ID257
Feature activation+1.87e+00

Pos logit contributions
appointed+0.283
unct+0.273
amas+0.271
rated+0.269
arse+0.269
Neg logit contributions
 named-0.036
 his-0.034
 tuna-0.028
 who-0.027
 shrimp-0.022
Token song
Token ID3496
Feature activation+1.10e+00

Pos logit contributions
appointed+0.075
chair+0.068
irds+0.067
ance+0.066
bows+0.065
Neg logit contributions
 tuna-0.078
 shrimp-0.076
 fish-0.074
 Wally-0.073
 fisherman-0.072
Token coming
Token ID2406
Feature activation+2.14e+00

Pos logit contributions
appointed+0.060
unct+0.057
irds+0.056
chair+0.055
 halls+0.053
Neg logit contributions
 Wally-0.027
 tuna-0.026
 shrimp-0.024
 Sebastian-0.023
 named-0.023
Token from
Token ID422
Feature activation+6.33e-02

Pos logit contributions
appointed+0.149
chair+0.147
irds+0.141
amas+0.141
vation+0.141
Neg logit contributions
 Wally-0.016
 like-0.014
 named-0.012
 shrimp-0.011
 tuna-0.011
Token outside
Token ID2354
Feature activation+2.43e+00

Pos logit contributions
appointed+0.003
chair+0.003
unct+0.002
irds+0.002
 halls+0.002
Neg logit contributions
 Wally-0.002
 tuna-0.002
 Sebastian-0.002
 shrimp-0.002
 fisher-0.002
Token.
Token ID13
Feature activation+3.35e+00

Pos logit contributions
appointed+0.163
irds+0.158
unct+0.154
rolling+0.150
pared+0.147
Neg logit contributions
 his-0.034
 named-0.032
 and-0.028
 their-0.026
 who-0.025
Token It
Token ID632
Feature activation+1.51e+00

Pos logit contributions
unct+0.180
arse+0.173
vation+0.172
uch+0.163
ics+0.160
Neg logit contributions
 Wally-0.094
 tuna-0.090
 named-0.085
 shrimp-0.085
 Fin-0.079
Token was
Token ID373
Feature activation-3.46e-01

Pos logit contributions
appointed+0.080
irds+0.080
unct+0.078
amas+0.077
chair+0.077
Neg logit contributions
 Wally-0.038
 tuna-0.035
 named-0.034
 Sebastian-0.033
 shrimp-0.033
Token a
Token ID257
Feature activation+1.50e-01

Pos logit contributions
 Wally+0.015
 fisher+0.015
 tuna+0.015
 shrimp+0.014
 Sebastian+0.014
Neg logit contributions
appointed-0.013
chair-0.012
 halls-0.010
clock-0.010
unct-0.010
Token beautiful
Token ID4950
Feature activation+1.46e+00

Pos logit contributions
appointed+0.005
chair+0.005
unct+0.005
irds+0.005
 halls+0.004
Neg logit contributions
 Wally-0.007
 tuna-0.007
 Sebastian-0.006
 shrimp-0.006
 fisher-0.006
Token,
Token ID11
Feature activation+5.89e-01

Pos logit contributions
appointed+0.046
unct+0.041
cases+0.040
irds+0.038
flash+0.038
Neg logit contributions
 tuna-0.076
 shrimp-0.074
 Wally-0.074
 Sebastian-0.068
 shark-0.068
Token you
Token ID345
Feature activation+1.42e-01

Pos logit contributions
appointed+0.135
iece+0.129
amas+0.125
age+0.124
rated+0.123
Neg logit contributions
 Wally-0.087
 shrimp-0.084
 tuna-0.083
 Finn-0.078
 Otto-0.072
Token frightened
Token ID24776
Feature activation+9.72e-02

Pos logit contributions
appointed+0.006
chair+0.006
unct+0.005
irds+0.005
 halls+0.005
Neg logit contributions
 Wally-0.006
 tuna-0.005
 shrimp-0.005
 Sebastian-0.005
 fisher-0.005
Token?
Token ID30
Feature activation+1.01e+00

Pos logit contributions
appointed+0.005
chair+0.005
unct+0.005
irds+0.005
 halls+0.005
Neg logit contributions
 Wally-0.003
 tuna-0.002
 Sebastian-0.002
 shrimp-0.002
 fisher-0.002
Tokenâ
Token ID22940
Feature activation+3.45e-01

Pos logit contributions
appointed+0.035
unct+0.032
chair+0.032
irds+0.030
 halls+0.030
Neg logit contributions
 Wally-0.046
 tuna-0.045
 Sebastian-0.043
 shrimp-0.043
 Finn-0.042
Token€
Token ID26391
Feature activation+1.48e+00

Pos logit contributions
appointed+0.010
chair+0.009
irds+0.008
unct+0.008
 halls+0.008
Neg logit contributions
 Wally-0.018
 tuna-0.018
 Sebastian-0.017
 shrimp-0.017
 fisher-0.017
Token
Token ID220
Feature activation+3.39e+00

Pos logit contributions
appointed+0.071
unct+0.066
chair+0.064
ring+0.063
bars+0.063
Neg logit contributions
 Wally-0.048
 tuna-0.046
 shrimp-0.044
 Sebastian-0.044
 named-0.042
Token
Token ID198
Feature activation+1.45e-01

Pos logit contributions
unct+0.198
appointed+0.192
 halls+0.185
erved+0.184
ring+0.184
Neg logit contributions
 Wally-0.065
 Sebastian-0.054
 Finn-0.054
 Fin-0.051
 tuna-0.049
Token
Token ID198
Feature activation+6.73e-01

Pos logit contributions
appointed+0.007
chair+0.007
irds+0.007
unct+0.007
 halls+0.006
Neg logit contributions
 Wally-0.004
 tuna-0.004
 shrimp-0.004
 Sebastian-0.004
 fisher-0.004
TokenJane
Token ID41083
Feature activation-1.61e-01

Pos logit contributions
appointed+0.028
chair+0.028
unct+0.027
irds+0.026
flash+0.025
Neg logit contributions
 Wally-0.025
 tuna-0.024
 shrimp-0.023
 Sebastian-0.023
 fisher-0.022
Token said
Token ID531
Feature activation+1.23e+00

Pos logit contributions
 Wally+0.005
 tuna+0.005
 Sebastian+0.005
 shrimp+0.005
 fisher+0.005
Neg logit contributions
appointed-0.008
chair-0.007
unct-0.007
irds-0.007
 halls-0.007
Token,
Token ID11
Feature activation-6.02e-02

Pos logit contributions
appointed+0.075
chair+0.070
 halls+0.069
clock+0.068
flash+0.068
Neg logit contributions
 Wally-0.027
 tuna-0.024
 shrimp-0.024
 Sebastian-0.022
 fisher-0.020
Token looked
Token ID3114
Feature activation-8.41e-01

Pos logit contributions
appointed+0.042
chair+0.040
irds+0.037
unct+0.037
flash+0.036
Neg logit contributions
 Wally-0.029
 tuna-0.029
 shrimp-0.028
 Sebastian-0.027
 fisher-0.027
Token so
Token ID523
Feature activation+6.31e-01

Pos logit contributions
 Wally+0.038
 tuna+0.036
 fisher+0.036
 Sebastian+0.036
 Otto+0.034
Neg logit contributions
appointed-0.030
chair-0.028
unct-0.025
 halls-0.025
cases-0.024
Token pretty
Token ID2495
Feature activation-1.03e+00

Pos logit contributions
appointed+0.026
chair+0.026
irds+0.024
unct+0.023
amas+0.023
Neg logit contributions
 Wally-0.025
 tuna-0.024
 shrimp-0.024
 Sebastian-0.023
 fisher-0.022
Token and
Token ID290
Feature activation+7.92e-01

Pos logit contributions
 Wally+0.034
 fisher+0.032
 Sebastian+0.031
 tuna+0.031
 shrimp+0.029
Neg logit contributions
chair-0.048
appointed-0.046
bows-0.044
ring-0.043
cases-0.043
Token felt
Token ID2936
Feature activation-9.79e-01

Pos logit contributions
appointed+0.036
chair+0.034
irds+0.032
unct+0.032
amas+0.032
Neg logit contributions
 Wally-0.028
 tuna-0.027
 shrimp-0.027
 Sebastian-0.027
 fisher-0.025
Token so
Token ID523
Feature activation+1.30e+00

Pos logit contributions
 Wally+0.040
 fisher+0.038
 Sebastian+0.038
 tuna+0.038
 fossil+0.037
Neg logit contributions
appointed-0.037
chair-0.034
ring-0.031
filled-0.031
 halls-0.031
Token confident
Token ID6563
Feature activation-1.21e+00

Pos logit contributions
chair+0.061
appointed+0.060
irds+0.058
amas+0.057
flash+0.056
Neg logit contributions
 Wally-0.042
 shrimp-0.041
 tuna-0.040
 Sebastian-0.038
 fisher-0.037
Token in
Token ID287
Feature activation-7.90e-01

Pos logit contributions
 Wally+0.048
 tuna+0.046
 Sebastian+0.046
 fisher+0.046
 shrimp+0.045
Neg logit contributions
chair-0.048
appointed-0.046
ring-0.042
bows-0.042
iece-0.041
Token her
Token ID607
Feature activation+1.11e+00

Pos logit contributions
 Wally+0.041
 tuna+0.040
 fisher+0.039
 Sebastian+0.039
 shrimp+0.038
Neg logit contributions
appointed-0.022
chair-0.020
 halls-0.019
unct-0.018
ring-0.018
Token black
Token ID2042
Feature activation+1.37e+00

Pos logit contributions
appointed+0.046
chair+0.040
irds+0.038
flash+0.038
unct+0.038
Neg logit contributions
 tuna-0.046
 Wally-0.044
 shrimp-0.044
 Sebastian-0.041
 fisher-0.041
Token dress
Token ID6576
Feature activation+2.12e+00

Pos logit contributions
appointed+0.057
unct+0.049
flash+0.048
chair+0.047
irds+0.046
Neg logit contributions
 tuna-0.056
 Wally-0.054
 shrimp-0.054
 Sebastian-0.051
 named-0.050
Token said
Token ID531
Feature activation+9.67e-01

Pos logit contributions
appointed+0.015
chair+0.015
unct+0.014
irds+0.014
 halls+0.013
Neg logit contributions
 Wally-0.009
 tuna-0.009
 Sebastian-0.008
 shrimp-0.008
 fisher-0.008
Token it
Token ID340
Feature activation+2.06e+00

Pos logit contributions
appointed+0.056
chair+0.053
unct+0.051
clock+0.050
flash+0.050
Neg logit contributions
 Wally-0.024
 shrimp-0.023
 tuna-0.023
 Sebastian-0.021
 Finn-0.019
Token was
Token ID373
Feature activation+1.50e-01

Pos logit contributions
irds+0.119
appointed+0.118
unct+0.118
iece+0.114
ong+0.114
Neg logit contributions
 tuna-0.038
 Wally-0.037
 named-0.034
 shrimp-0.031
 Finn-0.030
Token time
Token ID640
Feature activation+7.17e-02

Pos logit contributions
appointed+0.006
chair+0.006
unct+0.005
irds+0.005
 halls+0.005
Neg logit contributions
 Wally-0.006
 tuna-0.006
 Sebastian-0.006
 shrimp-0.006
 fisher-0.006
Token for
Token ID329
Feature activation+1.13e+00

Pos logit contributions
appointed+0.003
chair+0.003
unct+0.003
irds+0.003
 halls+0.002
Neg logit contributions
 Wally-0.003
 tuna-0.003
 Sebastian-0.003
 shrimp-0.003
 fisher-0.003
Token bed
Token ID3996
Feature activation+1.63e+00

Pos logit contributions
appointed+0.049
chair+0.048
irds+0.044
bars+0.043
cases+0.042
Neg logit contributions
 Wally-0.046
 tuna-0.045
 shrimp-0.043
 Sebastian-0.042
 Finn-0.040
Token.
Token ID13
Feature activation+1.63e+00

Pos logit contributions
appointed+0.093
irds+0.087
unct+0.086
chair+0.085
flash+0.084
Neg logit contributions
 tuna-0.037
 Wally-0.037
 shrimp-0.034
 Sebastian-0.032
 named-0.030
Token Lily
Token ID20037
Feature activation+1.17e+00

Pos logit contributions
unct+0.085
appointed+0.082
irds+0.081
adows+0.080
ights+0.080
Neg logit contributions
 Wally-0.048
 tuna-0.045
 shrimp-0.043
 Sebastian-0.040
 Finn-0.040
Token got
Token ID1392
Feature activation+8.04e-01

Pos logit contributions
appointed+0.055
chair+0.053
irds+0.052
bows+0.051
unct+0.050
Neg logit contributions
 Wally-0.038
 tuna-0.037
 shrimp-0.036
 Sebastian-0.035
 Finn-0.033
Token ready
Token ID3492
Feature activation-1.12e-01

Pos logit contributions
appointed+0.036
chair+0.035
irds+0.034
bows+0.032
unct+0.031
Neg logit contributions
 Wally-0.029
 tuna-0.029
 shrimp-0.028
 Sebastian-0.027
 Finn-0.026
Token for
Token ID329
Feature activation+8.18e-01

Pos logit contributions
 Wally+0.003
 tuna+0.003
 Sebastian+0.003
 shrimp+0.003
 fisher+0.003
Neg logit contributions
appointed-0.006
chair-0.006
unct-0.005
irds-0.005
 halls-0.005
Token mom
Token ID1995
Feature activation+4.21e-01

Pos logit contributions
appointed+0.067
unct+0.061
chair+0.060
irds+0.059
filled+0.059
Neg logit contributions
 tuna-0.041
 Wally-0.040
 shrimp-0.038
 fisher-0.035
 Sebastian-0.035
Token warned
Token ID7728
Feature activation+2.41e+00

Pos logit contributions
appointed+0.022
chair+0.021
irds+0.020
unct+0.020
 halls+0.020
Neg logit contributions
 Wally-0.012
 tuna-0.011
 Sebastian-0.011
 shrimp-0.011
 fisher-0.010
Token her
Token ID607
Feature activation+1.79e+00

Pos logit contributions
irds+0.133
appointed+0.131
rolling+0.131
cases+0.130
chair+0.130
Neg logit contributions
 Wally-0.059
 tuna-0.054
 them-0.049
 shrimp-0.048
 named-0.046
Token not
Token ID407
Feature activation+1.80e+00

Pos logit contributions
appointed+0.106
unct+0.101
flash+0.101
chair+0.099
amas+0.098
Neg logit contributions
 Wally-0.036
 tuna-0.034
 shrimp-0.034
 Sebastian-0.031
 named-0.030
Token to
Token ID284
Feature activation+1.65e+00

Pos logit contributions
chair+0.099
appointed+0.098
amas+0.097
unct+0.093
vation+0.092
Neg logit contributions
 Wally-0.042
 tuna-0.041
 shrimp-0.040
 Sebastian-0.036
 fisher-0.035
Token touch
Token ID3638
Feature activation+1.28e+00

Pos logit contributions
chair+0.074
appointed+0.074
amas+0.073
cases+0.073
irds+0.071
Neg logit contributions
 Wally-0.053
 tuna-0.049
 shrimp-0.048
 Sebastian-0.048
 fisher-0.048
Token the
Token ID262
Feature activation+1.45e+00

Pos logit contributions
appointed+0.071
chair+0.064
unct+0.064
irds+0.063
rolling+0.061
Neg logit contributions
 Wally-0.036
 tuna-0.035
 shrimp-0.033
 Sebastian-0.032
 fisher-0.030
Token shelf
Token ID18316
Feature activation+4.97e-01

Pos logit contributions
appointed+0.062
unct+0.057
irds+0.056
rolling+0.055
chair+0.052
Neg logit contributions
 tuna-0.060
 Wally-0.055
 fossil-0.053
 shrimp-0.053
 fisher-0.049
Token,
Token ID11
Feature activation-9.63e-02

Pos logit contributions
appointed+0.024
chair+0.023
unct+0.022
irds+0.022
flash+0.021
Neg logit contributions
 Wally-0.016
 tuna-0.015
 shrimp-0.015
 Sebastian-0.014
 fisher-0.014
Token but
Token ID475
Feature activation+2.32e-01

Pos logit contributions
 Wally+0.003
 tuna+0.003
 Sebastian+0.003
 fisher+0.003
 shrimp+0.003
Neg logit contributions
appointed-0.005
chair-0.005
irds-0.004
unct-0.004
 halls-0.004
Token she
Token ID673
Feature activation-4.04e-01

Pos logit contributions
appointed+0.009
chair+0.008
unct+0.008
irds+0.008
 halls+0.007
Neg logit contributions
 Wally-0.010
 tuna-0.010
 shrimp-0.010
 Sebastian-0.010
 fisher-0.009
Token Tim
Token ID5045
Feature activation+1.24e+00

Pos logit contributions
 Wally+0.011
 fisher+0.011
 tuna+0.011
 Sebastian+0.011
 shrimp+0.011
Neg logit contributions
appointed-0.012
chair-0.012
unct-0.011
 halls-0.011
irds-0.011
Tokenmy
Token ID1820
Feature activation+1.05e+00

Pos logit contributions
appointed+0.055
chair+0.051
irds+0.050
unct+0.047
cases+0.047
Neg logit contributions
 tuna-0.045
 Wally-0.044
 shrimp-0.042
 Sebastian-0.040
 fisher-0.040
Token's
Token ID338
Feature activation+8.56e-01

Pos logit contributions
appointed+0.042
irds+0.037
chair+0.037
unct+0.036
flash+0.034
Neg logit contributions
 tuna-0.044
 Wally-0.044
 shrimp-0.042
 Sebastian-0.040
 fisher-0.039
Token room
Token ID2119
Feature activation+1.59e-01

Pos logit contributions
appointed+0.031
irds+0.027
chair+0.027
unct+0.027
flash+0.025
Neg logit contributions
 tuna-0.040
 Wally-0.039
 shrimp-0.038
 Sebastian-0.037
 fisher-0.036
Token and
Token ID290
Feature activation+6.80e-01

Pos logit contributions
appointed+0.009
chair+0.009
unct+0.008
irds+0.008
 halls+0.008
Neg logit contributions
 Wally-0.004
 tuna-0.004
 Sebastian-0.004
 shrimp-0.004
 fisher-0.003
Token take
Token ID1011
Feature activation+1.56e+00

Pos logit contributions
appointed+0.030
chair+0.028
irds+0.026
unct+0.026
amas+0.025
Neg logit contributions
 Wally-0.026
 tuna-0.025
 shrimp-0.024
 Sebastian-0.024
 fisher-0.023
Token the
Token ID262
Feature activation+1.10e+00

Pos logit contributions
appointed+0.086
chair+0.080
irds+0.079
unct+0.076
bars+0.075
Neg logit contributions
 Wally-0.047
 tuna-0.045
 Sebastian-0.043
 shrimp-0.042
 Fin-0.040
Token key
Token ID1994
Feature activation+1.35e+00

Pos logit contributions
appointed+0.045
irds+0.040
chair+0.039
unct+0.037
ring+0.037
Neg logit contributions
 tuna-0.047
 Wally-0.046
 shrimp-0.045
 Sebastian-0.041
 fisher-0.041
Token.
Token ID13
Feature activation+3.37e-01

Pos logit contributions
appointed+0.070
unct+0.063
irds+0.063
chair+0.063
amas+0.061
Neg logit contributions
 Wally-0.037
 tuna-0.035
 Sebastian-0.034
 shrimp-0.033
 fisher-0.031
Token
Token ID220
Feature activation+1.28e+00

Pos logit contributions
appointed+0.014
chair+0.014
unct+0.013
irds+0.013
 halls+0.012
Neg logit contributions
 Wally-0.013
 tuna-0.012
 Sebastian-0.012
 shrimp-0.012
 fisher-0.012
Token
Token ID198
Feature activation-4.36e-02

Pos logit contributions
appointed+0.073
chair+0.070
unct+0.069
 halls+0.067
ring+0.067
Neg logit contributions
 Wally-0.031
 Sebastian-0.027
 tuna-0.027
 Finn-0.024
 shrimp-0.024
Token lived
Token ID5615
Feature activation-3.88e-01

Pos logit contributions
appointed+0.042
chair+0.040
irds+0.039
unct+0.039
amas+0.037
Neg logit contributions
 tuna-0.024
 Wally-0.024
 shrimp-0.022
 Sebastian-0.022
 fossil-0.022
Token in
Token ID287
Feature activation+6.29e-01

Pos logit contributions
 Wally+0.013
 Sebastian+0.012
 tuna+0.012
 shrimp+0.012
 fisher+0.011
Neg logit contributions
appointed-0.018
chair-0.017
unct-0.016
flash-0.016
 halls-0.016
Token a
Token ID257
Feature activation+7.78e-01

Pos logit contributions
appointed+0.032
chair+0.030
irds+0.029
unct+0.028
cases+0.028
Neg logit contributions
 Wally-0.020
 tuna-0.020
 shrimp-0.019
 Sebastian-0.018
 fisher-0.018
Token big
Token ID1263
Feature activation-2.81e-01

Pos logit contributions
appointed+0.033
chair+0.031
irds+0.030
unct+0.029
ring+0.028
Neg logit contributions
 Wally-0.031
 tuna-0.031
 shrimp-0.030
 fisher-0.028
 Sebastian-0.028
Token castle
Token ID16669
Feature activation+2.83e-01

Pos logit contributions
 Wally+0.013
 Sebastian+0.012
 tuna+0.012
 shrimp+0.012
 fisher+0.012
Neg logit contributions
appointed-0.009
chair-0.009
unct-0.008
 halls-0.008
irds-0.008
Token.
Token ID13
Feature activation+1.05e+00

Pos logit contributions
appointed+0.014
chair+0.013
unct+0.013
irds+0.013
 halls+0.012
Neg logit contributions
 Wally-0.009
 tuna-0.008
 shrimp-0.008
 Sebastian-0.008
 fisher-0.008
Token He
Token ID679
Feature activation-3.53e-01

Pos logit contributions
appointed+0.051
unct+0.050
irds+0.048
chair+0.046
ring+0.045
Neg logit contributions
 Wally-0.035
 tuna-0.034
 Sebastian-0.031
 shrimp-0.030
 fossil-0.030
Token was
Token ID373
Feature activation-9.44e-02

Pos logit contributions
 Wally+0.015
 Sebastian+0.014
 tuna+0.014
 shrimp+0.013
 fisher+0.013
Neg logit contributions
appointed-0.014
chair-0.014
 halls-0.013
unct-0.012
irds-0.012
Token very
Token ID845
Feature activation+3.29e-01

Pos logit contributions
 Wally+0.004
 tuna+0.003
 Sebastian+0.003
 shrimp+0.003
 fisher+0.003
Neg logit contributions
appointed-0.004
chair-0.004
unct-0.004
 halls-0.004
flash-0.003
Token tall
Token ID7331
Feature activation-1.33e+00

Pos logit contributions
appointed+0.014
chair+0.013
irds+0.012
unct+0.012
 halls+0.012
Neg logit contributions
 Wally-0.013
 tuna-0.012
 shrimp-0.012
 Sebastian-0.012
 fisher-0.012
Token and
Token ID290
Feature activation+7.33e-01

Pos logit contributions
 Wally+0.043
 Sebastian+0.041
 shrimp+0.037
 tuna+0.037
 Finn+0.036
Neg logit contributions
chair-0.067
appointed-0.060
 halls-0.058
clock-0.058
bows-0.058
Token with
Token ID351
Feature activation-2.49e-01

Pos logit contributions
 Sebastian+0.077
 Wally+0.076
 fossil+0.071
 Otto+0.070
 fisher+0.069
Neg logit contributions
chair-0.077
bows-0.064
appointed-0.063
clock-0.062
est-0.062
Token princess
Token ID21752
Feature activation+7.72e-02

Pos logit contributions
 Wally+0.009
 tuna+0.008
 Sebastian+0.008
 shrimp+0.008
 fisher+0.008
Neg logit contributions
appointed-0.011
chair-0.011
irds-0.010
unct-0.010
 halls-0.010
Tokenes
Token ID274
Feature activation-1.26e+00

Pos logit contributions
appointed+0.003
chair+0.003
irds+0.003
unct+0.003
 halls+0.003
Neg logit contributions
 Wally-0.003
 tuna-0.003
 Sebastian-0.003
 shrimp-0.003
 fisher-0.002
Token and
Token ID290
Feature activation-2.69e-01

Pos logit contributions
 Wally+0.040
 fisher+0.038
 Sebastian+0.038
 tuna+0.037
 fossil+0.036
Neg logit contributions
chair-0.058
appointed-0.056
bows-0.053
cases-0.052
ring-0.052
Token fair
Token ID3148
Feature activation+1.25e+00

Pos logit contributions
 Wally+0.009
 tuna+0.009
 Sebastian+0.009
 shrimp+0.009
 fisher+0.009
Neg logit contributions
appointed-0.012
chair-0.012
irds-0.011
 halls-0.011
unct-0.010
Tokenies
Token ID444
Feature activation-1.13e+00

Pos logit contributions
appointed+0.050
chair+0.044
unct+0.042
irds+0.039
flash+0.039
Neg logit contributions
 tuna-0.054
 shrimp-0.054
 Wally-0.053
 fisher-0.051
 Sebastian-0.050
Token.
Token ID13
Feature activation-3.30e-01

Pos logit contributions
 Wally+0.044
 Sebastian+0.043
 tuna+0.042
 fisher+0.041
 shrimp+0.040
Neg logit contributions
chair-0.044
appointed-0.043
bows-0.039
ring-0.039
irds-0.038
Token
Token ID198
Feature activation-9.21e-01

Pos logit contributions
 Wally+0.012
 tuna+0.011
 shrimp+0.011
 fisher+0.011
 Sebastian+0.011
Neg logit contributions
appointed-0.015
chair-0.014
irds-0.013
 halls-0.013
clock-0.013
Token
Token ID198
Feature activation-4.65e-01

Pos logit contributions
 Wally+0.035
 shrimp+0.034
 tuna+0.034
 fisher+0.034
 Sebastian+0.033
Neg logit contributions
chair-0.038
appointed-0.038
irds-0.033
 halls-0.033
flash-0.033
TokenOne
Token ID3198
Feature activation+1.11e+00

Pos logit contributions
 Wally+0.020
 tuna+0.019
 Sebastian+0.019
 shrimp+0.019
 fisher+0.018
Neg logit contributions
appointed-0.019
chair-0.017
 halls-0.015
irds-0.015
clock-0.015
Token day
Token ID1110
Feature activation-1.52e-04

Pos logit contributions
appointed+0.043
unct+0.040
amas+0.038
irds+0.038
cases+0.037
Neg logit contributions
 tuna-0.049
 Wally-0.047
 shrimp-0.047
 fisher-0.045
 fisherman-0.043
Token Lily
Token ID20037
Feature activation+4.50e-01

Pos logit contributions
appointed+0.059
amas+0.054
chair+0.053
cases+0.052
irds+0.052
Neg logit contributions
 Wally-0.029
 tuna-0.027
 shrimp-0.025
 Finn-0.024
 Sebastian-0.024
Token say
Token ID910
Feature activation+8.34e-01

Pos logit contributions
appointed+0.022
chair+0.021
irds+0.020
unct+0.020
 halls+0.019
Neg logit contributions
 Wally-0.014
 tuna-0.014
 shrimp-0.013
 Sebastian-0.013
 fisher-0.013
Token.
Token ID13
Feature activation-1.11e+00

Pos logit contributions
appointed+0.045
chair+0.043
unct+0.041
irds+0.040
 halls+0.040
Neg logit contributions
 Wally-0.023
 tuna-0.022
 shrimp-0.021
 fisher-0.020
 Sebastian-0.020
Token
Token ID198
Feature activation-7.15e-01

Pos logit contributions
 fossil+0.049
 shrimp+0.049
 tuna+0.049
 Wally+0.048
 fisher+0.048
Neg logit contributions
chair-0.036
appointed-0.034
clock-0.031
 Lightning-0.031
EEP-0.030
Token
Token ID198
Feature activation-4.59e-01

Pos logit contributions
 Wally+0.026
 shrimp+0.026
 tuna+0.026
 fisher+0.025
 Sebastian+0.025
Neg logit contributions
appointed-0.030
chair-0.030
irds-0.027
 halls-0.026
flash-0.026
Token"
Token ID1
Feature activation-8.78e-01

Pos logit contributions
 Wally+0.017
 tuna+0.017
 shrimp+0.017
 Sebastian+0.017
 fisher+0.016
Neg logit contributions
appointed-0.020
chair-0.019
irds-0.017
 halls-0.017
unct-0.016
TokenYes
Token ID5297
Feature activation-1.52e-01

Pos logit contributions
 Wally+0.037
 tuna+0.036
 Sebastian+0.036
 shrimp+0.034
 fossil+0.034
Neg logit contributions
appointed-0.032
chair-0.029
bows-0.028
irds-0.028
clock-0.026
Token,
Token ID11
Feature activation-9.84e-01

Pos logit contributions
 Wally+0.005
 tuna+0.005
 Sebastian+0.005
 shrimp+0.005
 fisher+0.005
Neg logit contributions
appointed-0.007
chair-0.007
irds-0.006
unct-0.006
 halls-0.006
Token I
Token ID314
Feature activation-1.21e+00

Pos logit contributions
 Wally+0.043
 fisher+0.042
 Sebastian+0.042
 tuna+0.042
 fossil+0.041
Neg logit contributions
chair-0.034
appointed-0.030
irds-0.028
 roof-0.028
cases-0.026
Token have
Token ID423
Feature activation-7.67e-02

Pos logit contributions
 Wally+0.062
 shrimp+0.058
 tuna+0.058
 Sebastian+0.058
 Otto+0.057
Neg logit contributions
appointed-0.033
bows-0.033
up-0.032
age-0.031
chair-0.030
Token.
Token ID13
Feature activation-5.51e-01

Pos logit contributions
 Wally+0.003
 Sebastian+0.003
 tuna+0.003
 shrimp+0.003
 fisher+0.003
Neg logit contributions
appointed-0.003
chair-0.003
irds-0.003
unct-0.003
 halls-0.003
Token starting
Token ID3599
Feature activation-8.57e-01

Pos logit contributions
 Wally+0.026
 Sebastian+0.025
 tuna+0.025
 shrimp+0.025
 fisher+0.025
Neg logit contributions
appointed-0.027
chair-0.025
ring-0.022
unct-0.022
 halls-0.022
Token to
Token ID284
Feature activation-1.15e-01

Pos logit contributions
 Wally+0.040
 tuna+0.039
 Sebastian+0.039
 shrimp+0.038
 fisher+0.038
Neg logit contributions
appointed-0.028
chair-0.024
clock-0.023
unct-0.023
irds-0.022
Token get
Token ID651
Feature activation-9.40e-01

Pos logit contributions
 Wally+0.005
 tuna+0.005
 shrimp+0.005
 Sebastian+0.005
 fisher+0.005
Neg logit contributions
appointed-0.004
chair-0.003
unct-0.003
irds-0.003
 halls-0.003
Token really
Token ID1107
Feature activation-8.34e-01

Pos logit contributions
 Wally+0.041
 fossil+0.041
 fisher+0.040
 Sebastian+0.040
 tuna+0.040
Neg logit contributions
appointed-0.032
chair-0.027
unct-0.026
flash-0.026
clock-0.025
Token frustrated
Token ID14718
Feature activation-1.31e+00

Pos logit contributions
 Wally+0.037
 Sebastian+0.036
 tuna+0.036
 fossil+0.035
 fisher+0.035
Neg logit contributions
appointed-0.030
chair-0.024
unct-0.024
 halls-0.022
irds-0.022
Token.
Token ID13
Feature activation+6.13e-01

Pos logit contributions
 Wally+0.048
 Sebastian+0.046
 fisher+0.045
 shrimp+0.044
 tuna+0.044
Neg logit contributions
chair-0.057
appointed-0.054
ness-0.048
ring-0.048
cases-0.048
Token He
Token ID679
Feature activation-3.60e-01

Pos logit contributions
appointed+0.026
chair+0.026
unct+0.024
irds+0.024
 halls+0.023
Neg logit contributions
 Wally-0.023
 tuna-0.022
 Sebastian-0.021
 shrimp-0.021
 fisher-0.020
Token started
Token ID2067
Feature activation-3.76e-01

Pos logit contributions
 Wally+0.015
 tuna+0.014
 Sebastian+0.014
 shrimp+0.014
 fisher+0.014
Neg logit contributions
appointed-0.014
chair-0.014
unct-0.012
 halls-0.012
irds-0.012
Token to
Token ID284
Feature activation+2.20e-01

Pos logit contributions
 Wally+0.016
 tuna+0.016
 Sebastian+0.015
 shrimp+0.015
 fisher+0.015
Neg logit contributions
appointed-0.014
chair-0.013
unct-0.012
irds-0.012
 halls-0.012
Token wonder
Token ID4240
Feature activation-1.56e+00

Pos logit contributions
appointed+0.007
chair+0.007
unct+0.006
irds+0.006
 halls+0.006
Neg logit contributions
 Wally-0.011
 tuna-0.010
 shrimp-0.010
 Sebastian-0.010
 fisher-0.010
Token if
Token ID611
Feature activation+5.82e-01

Pos logit contributions
 Wally+0.054
 tuna+0.052
 fisher+0.051
 Sebastian+0.050
 shrimp+0.050
Neg logit contributions
chair-0.068
appointed-0.068
unct-0.063
cases-0.062
irds-0.062
TokenOnce
Token ID7454
Feature activation+8.08e-01

Pos logit contributions
 Wally+0.027
 shrimp+0.027
 fisher+0.026
 tuna+0.026
 Sebastian+0.026
Neg logit contributions
chair-0.032
appointed-0.032
amas-0.028
 halls-0.028
flash-0.028
Token upon
Token ID2402
Feature activation+1.71e+00

Pos logit contributions
appointed+0.033
chair+0.030
unct+0.028
irds+0.028
ring+0.028
Neg logit contributions
 Wally-0.033
 tuna-0.031
 shrimp-0.031
 Sebastian-0.031
 Finn-0.029
Token a
Token ID257
Feature activation-7.53e-01

Pos logit contributions
appointed+0.098
unct+0.093
irds+0.091
chair+0.090
ring+0.088
Neg logit contributions
 Wally-0.044
 tuna-0.044
 shrimp-0.039
 Sebastian-0.035
 Finn-0.035
Token time
Token ID640
Feature activation-6.94e-01

Pos logit contributions
 Wally+0.034
 Sebastian+0.034
 tuna+0.033
 shrimp+0.032
 fisher+0.032
Neg logit contributions
appointed-0.027
chair-0.025
irds-0.022
unct-0.021
 halls-0.021
Token,
Token ID11
Feature activation-5.16e-01

Pos logit contributions
 Wally+0.025
 tuna+0.025
 Sebastian+0.024
 shrimp+0.024
 fisher+0.024
Neg logit contributions
appointed-0.029
chair-0.029
irds-0.026
unct-0.025
 halls-0.025
Token there
Token ID612
Feature activation-7.61e-01

Pos logit contributions
 Wally+0.017
 Sebastian+0.016
 tuna+0.016
 fisher+0.016
 shrimp+0.016
Neg logit contributions
appointed-0.024
chair-0.024
irds-0.021
 halls-0.021
flash-0.021
Token lived
Token ID5615
Feature activation-9.15e-01

Pos logit contributions
 Wally+0.035
 fisher+0.034
 shrimp+0.034
 tuna+0.034
 Sebastian+0.033
Neg logit contributions
chair-0.025
appointed-0.024
 halls-0.020
irds-0.020
 ceiling-0.020
Token a
Token ID257
Feature activation-9.61e-01

Pos logit contributions
 Sebastian+0.040
 Wally+0.040
 fisher+0.038
 tuna+0.038
 fossil+0.038
Neg logit contributions
appointed-0.029
chair-0.025
flash-0.025
irds-0.024
 halls-0.024
Token mighty
Token ID18680
Feature activation-6.10e-01

Pos logit contributions
 Wally+0.049
 Sebastian+0.048
 tuna+0.046
 shrimp+0.045
 Finn+0.044
Neg logit contributions
chair-0.025
appointed-0.024
 roof-0.021
 halls-0.021
flash-0.021
Token dragon
Token ID10441
Feature activation-1.19e+00

Pos logit contributions
 Wally+0.025
 Sebastian+0.024
 tuna+0.023
 shrimp+0.023
 fisher+0.022
Neg logit contributions
chair-0.023
appointed-0.022
irds-0.020
unct-0.020
 halls-0.020
Token.
Token ID13
Feature activation+9.09e-01

Pos logit contributions
 Wally+0.034
 Sebastian+0.032
 tuna+0.030
 Otto+0.029
 shrimp+0.028
Neg logit contributions
chair-0.062
appointed-0.058
ring-0.056
irds-0.055
age-0.055
Token new
Token ID649
Feature activation-1.10e+00

Pos logit contributions
 Wally+0.018
 tuna+0.018
 shrimp+0.018
 fisher+0.017
 Sebastian+0.017
Neg logit contributions
appointed-0.015
chair-0.015
irds-0.013
unct-0.013
 halls-0.013
Token flag
Token ID6056
Feature activation-4.45e-01

Pos logit contributions
 Wally+0.048
 Sebastian+0.045
 tuna+0.044
 fisher+0.044
 shrimp+0.043
Neg logit contributions
chair-0.039
appointed-0.038
irds-0.034
 roof-0.033
bows-0.033
Token.
Token ID13
Feature activation-8.68e-01

Pos logit contributions
 Wally+0.015
 tuna+0.015
 Sebastian+0.014
 shrimp+0.014
 fisher+0.014
Neg logit contributions
appointed-0.020
chair-0.020
irds-0.018
unct-0.018
ring-0.018
Token
Token ID198
Feature activation-7.64e-01

Pos logit contributions
 Wally+0.037
 tuna+0.036
 shrimp+0.036
 fisher+0.035
 Sebastian+0.035
Neg logit contributions
appointed-0.032
chair-0.031
irds-0.027
clock-0.026
 halls-0.026
Token
Token ID198
Feature activation-6.70e-01

Pos logit contributions
 Wally+0.030
 tuna+0.029
 shrimp+0.029
 Sebastian+0.028
 fisher+0.028
Neg logit contributions
appointed-0.031
chair-0.030
irds-0.027
 halls-0.027
flash-0.026
TokenEvery
Token ID6109
Feature activation+4.45e-01

Pos logit contributions
 Wally+0.027
 tuna+0.027
 shrimp+0.026
 Sebastian+0.026
 fisher+0.025
Neg logit contributions
appointed-0.027
chair-0.025
irds-0.023
 halls-0.022
unct-0.022
Token day
Token ID1110
Feature activation-3.52e-01

Pos logit contributions
appointed+0.013
chair+0.012
unct+0.011
irds+0.011
 halls+0.011
Neg logit contributions
 Wally-0.023
 tuna-0.022
 shrimp-0.022
 Sebastian-0.022
 fisher-0.021
Token she
Token ID673
Feature activation-5.78e-01

Pos logit contributions
 Wally+0.012
 tuna+0.011
 Sebastian+0.011
 shrimp+0.011
 fisher+0.011
Neg logit contributions
appointed-0.016
chair-0.016
irds-0.014
 halls-0.014
unct-0.014
Token made
Token ID925
Feature activation+2.18e+00

Pos logit contributions
 Wally+0.025
 tuna+0.023
 Sebastian+0.023
 shrimp+0.023
 fisher+0.022
Neg logit contributions
appointed-0.021
chair-0.021
unct-0.019
 halls-0.019
irds-0.019
Token sure
Token ID1654
Feature activation+1.12e+00

Pos logit contributions
appointed+0.124
chair+0.118
EEP+0.112
unct+0.111
amas+0.109
Neg logit contributions
 tuna-0.059
 shrimp-0.059
 Wally-0.057
 Sebastian-0.055
 fish-0.051
Token to
Token ID284
Feature activation-4.43e-01

Pos logit contributions
appointed+0.053
chair+0.048
amas+0.047
unct+0.047
flash+0.046
Neg logit contributions
 Wally-0.040
 tuna-0.038
 shrimp-0.038
 Sebastian-0.037
 fisher-0.034
Token is
Token ID318
Feature activation-2.27e+00

Pos logit contributions
 Wally+0.037
 Sebastian+0.036
 shrimp+0.035
 fisher+0.034
 tuna+0.034
Neg logit contributions
chair-0.030
appointed-0.029
unct-0.026
bows-0.026
up-0.025
Token very
Token ID845
Feature activation-1.32e+00

Pos logit contributions
 fisher+0.106
 Sebastian+0.104
 Wally+0.103
 shrimp+0.099
 Finn+0.098
Neg logit contributions
appointed-0.051
chair-0.050
clock-0.047
 biggest-0.045
 same-0.044
Token old
Token ID1468
Feature activation-2.40e+00

Pos logit contributions
 Wally+0.064
 Sebastian+0.062
 fisher+0.059
 tuna+0.058
 Finn+0.058
Neg logit contributions
appointed-0.035
chair-0.033
up-0.031
unct-0.031
clock-0.029
Token and
Token ID290
Feature activation-5.15e-01

Pos logit contributions
 Wally+0.096
 Sebastian+0.094
 Finn+0.085
 Fin+0.085
 Otto+0.083
Neg logit contributions
chair-0.103
clock-0.087
bows-0.084
est-0.083
 ceiling-0.081
Token has
Token ID468
Feature activation-1.73e-01

Pos logit contributions
 Wally+0.021
 Sebastian+0.020
 fisher+0.020
 shrimp+0.019
 tuna+0.019
Neg logit contributions
appointed-0.020
chair-0.020
unct-0.017
 halls-0.017
clock-0.017
Token a
Token ID257
Feature activation-2.00e+00

Pos logit contributions
 Wally+0.007
 Sebastian+0.006
 tuna+0.006
 shrimp+0.006
 fisher+0.006
Neg logit contributions
appointed-0.007
chair-0.007
unct-0.006
irds-0.006
 halls-0.006
Token stitch
Token ID24695
Feature activation-1.13e+00

Pos logit contributions
 Wally+0.103
 Finn+0.098
 Sebastian+0.098
 fisher+0.095
 Fin+0.095
Neg logit contributions
 roof-0.051
chair-0.051
 ceiling-0.046
appointed-0.045
age-0.042
Token,
Token ID11
Feature activation-1.49e+00

Pos logit contributions
 Wally+0.042
 Sebastian+0.040
 tuna+0.038
 fisher+0.038
 shrimp+0.038
Neg logit contributions
chair-0.050
appointed-0.046
ring-0.044
bows-0.043
irds-0.042
Token but
Token ID475
Feature activation-1.56e+00

Pos logit contributions
 Wally+0.048
 fisher+0.048
 Sebastian+0.046
 shrimp+0.046
 Finn+0.043
Neg logit contributions
chair-0.066
appointed-0.063
bows-0.058
ring-0.058
clock-0.057
Token he
Token ID339
Feature activation-1.57e+00

Pos logit contributions
 Wally+0.057
 fisher+0.056
 Sebastian+0.056
 shrimp+0.054
 tuna+0.051
Neg logit contributions
chair-0.062
appointed-0.057
clock-0.054
bars-0.054
 halls-0.052
Token is
Token ID318
Feature activation-2.13e+00

Pos logit contributions
 Sebastian+0.071
 Wally+0.071
 shrimp+0.067
 fisher+0.067
 Crab+0.064
Neg logit contributions
chair-0.054
appointed-0.048
up-0.045
bows-0.045
unct-0.044
Token,
Token ID11
Feature activation-5.17e-01

Pos logit contributions
 Wally+0.034
 Sebastian+0.033
 fisher+0.033
 tuna+0.033
 shrimp+0.032
Neg logit contributions
chair-0.041
appointed-0.040
cases-0.037
bows-0.036
unct-0.035
Token but
Token ID475
Feature activation-1.10e-01

Pos logit contributions
 Wally+0.014
 tuna+0.014
 fisher+0.013
 shrimp+0.013
 Sebastian+0.013
Neg logit contributions
appointed-0.028
chair-0.027
irds-0.024
unct-0.024
 halls-0.023
Token her
Token ID607
Feature activation+5.46e-01

Pos logit contributions
 Wally+0.006
 tuna+0.006
 shrimp+0.005
 Sebastian+0.005
 fisher+0.005
Neg logit contributions
appointed-0.003
chair-0.003
unct-0.003
irds-0.003
 halls-0.002
Token mom
Token ID1995
Feature activation-3.12e-01

Pos logit contributions
appointed+0.022
chair+0.020
irds+0.019
unct+0.019
 halls+0.018
Neg logit contributions
 Wally-0.023
 tuna-0.022
 shrimp-0.022
 Sebastian-0.021
 fisher-0.021
Tokenmy
Token ID1820
Feature activation-1.21e+00

Pos logit contributions
 Wally+0.011
 tuna+0.010
 shrimp+0.010
 Sebastian+0.010
 fisher+0.010
Neg logit contributions
appointed-0.015
chair-0.014
unct-0.013
irds-0.013
ring-0.012
Token was
Token ID373
Feature activation-1.82e+00

Pos logit contributions
 tuna+0.044
 Wally+0.043
 fossil+0.042
 shrimp+0.042
 fisher+0.041
Neg logit contributions
appointed-0.052
chair-0.049
unct-0.045
cases-0.045
up-0.044
Token persistent
Token ID16218
Feature activation-7.95e-01

Pos logit contributions
 fossil+0.084
 Wally+0.081
 Finn+0.081
 tuna+0.080
 Fin+0.078
Neg logit contributions
appointed-0.065
ics-0.050
chair-0.048
clock-0.047
rolling-0.046
Token and
Token ID290
Feature activation+2.05e-01

Pos logit contributions
 Wally+0.029
 Sebastian+0.028
 fisher+0.027
 fossil+0.027
 tuna+0.027
Neg logit contributions
chair-0.035
appointed-0.035
ring-0.030
clock-0.030
oga-0.030
Token made
Token ID925
Feature activation+1.92e+00

Pos logit contributions
appointed+0.010
chair+0.009
unct+0.009
irds+0.009
 halls+0.009
Neg logit contributions
 Wally-0.007
 tuna-0.006
 Sebastian-0.006
 shrimp-0.006
 fisher-0.006
Token sure
Token ID1654
Feature activation+1.12e+00

Pos logit contributions
appointed+0.097
chair+0.095
irds+0.093
 halls+0.088
ring+0.088
Neg logit contributions
 tuna-0.065
 Wally-0.063
 shrimp-0.059
 Finn-0.055
 Sebastian-0.055
Token she
Token ID673
Feature activation-5.68e-01

Pos logit contributions
appointed+0.054
chair+0.049
unct+0.048
irds+0.047
amas+0.047
Neg logit contributions
 Wally-0.039
 tuna-0.037
 shrimp-0.036
 Sebastian-0.035
 Finn-0.033
Token friend
Token ID1545
Feature activation-1.38e+00

Pos logit contributions
 Wally+0.059
 Sebastian+0.056
 tuna+0.053
 Finn+0.053
 Otto+0.052
Neg logit contributions
chair-0.029
irds-0.028
appointed-0.027
 roof-0.027
ness-0.026
Token too
Token ID1165
Feature activation-5.16e-01

Pos logit contributions
 Sebastian+0.063
 Wally+0.061
 tuna+0.059
 shrimp+0.059
 fisher+0.058
Neg logit contributions
chair-0.050
appointed-0.041
clock-0.041
ring-0.040
bye-0.039
Token.
Token ID13
Feature activation-1.33e+00

Pos logit contributions
 Wally+0.021
 Sebastian+0.020
 tuna+0.020
 shrimp+0.020
 fisher+0.019
Neg logit contributions
appointed-0.021
chair-0.020
clock-0.017
 halls-0.017
unct-0.017
Token You
Token ID921
Feature activation-8.99e-01

Pos logit contributions
 shrimp+0.062
 Wally+0.060
 tuna+0.060
 Sebastian+0.060
 fisher+0.058
Neg logit contributions
chair-0.041
appointed-0.041
irds-0.036
bye-0.035
 Lightning-0.035
Token always
Token ID1464
Feature activation-1.28e+00

Pos logit contributions
 Wally+0.040
 Sebastian+0.040
 shrimp+0.038
 tuna+0.038
 Otto+0.036
Neg logit contributions
appointed-0.030
chair-0.028
bows-0.026
flash-0.026
unct-0.026
Token share
Token ID2648
Feature activation-2.13e+00

Pos logit contributions
 Sebastian+0.058
 Wally+0.057
 shrimp+0.055
 Finn+0.053
 fisher+0.052
Neg logit contributions
appointed-0.043
unct-0.037
chair-0.036
ring-0.036
bows-0.035
Token with
Token ID351
Feature activation-8.83e-01

Pos logit contributions
 Sebastian+0.094
 fisher+0.091
 Wally+0.089
 Finn+0.088
 fisherman+0.086
Neg logit contributions
chair-0.061
 walls-0.059
appointed-0.056
 floors-0.055
 sides-0.055
Token me
Token ID502
Feature activation+9.66e-02

Pos logit contributions
 Sebastian+0.039
 Wally+0.039
 fisher+0.038
 tuna+0.037
 shrimp+0.037
Neg logit contributions
chair-0.026
appointed-0.026
irds-0.024
 halls-0.024
unct-0.024
Token."
Token ID526
Feature activation-1.18e+00

Pos logit contributions
appointed+0.004
chair+0.004
irds+0.004
unct+0.004
 halls+0.004
Neg logit contributions
 Wally-0.003
 tuna-0.003
 Sebastian-0.003
 shrimp-0.003
 fisher-0.003
Token
Token ID198
Feature activation-5.88e-01

Pos logit contributions
 Sebastian+0.049
 fossil+0.049
 fisher+0.049
 tuna+0.048
 Wally+0.047
Neg logit contributions
chair-0.040
appointed-0.039
irds-0.035
 Her-0.034
 halls-0.033
Token
Token ID198
Feature activation-9.99e-01

Pos logit contributions
 Wally+0.021
 Sebastian+0.020
 tuna+0.020
 shrimp+0.020
 fisher+0.020
Neg logit contributions
appointed-0.025
chair-0.025
irds-0.023
unct-0.022
 halls-0.022
Token needed
Token ID2622
Feature activation-7.93e-01

Pos logit contributions
appointed+0.004
chair+0.003
irds+0.003
unct+0.003
 halls+0.003
Neg logit contributions
 Wally-0.003
 tuna-0.003
 Sebastian-0.003
 shrimp-0.003
 fisher-0.003
Token a
Token ID257
Feature activation-8.63e-01

Pos logit contributions
 Wally+0.036
 Sebastian+0.034
 fisher+0.034
 tuna+0.033
 Finn+0.033
Neg logit contributions
appointed-0.025
chair-0.024
clock-0.022
irds-0.022
 halls-0.022
Token special
Token ID2041
Feature activation-1.51e+00

Pos logit contributions
 Wally+0.040
 Sebastian+0.038
 fisher+0.036
 tuna+0.036
 Finn+0.036
Neg logit contributions
appointed-0.027
chair-0.027
unct-0.024
 roof-0.024
irds-0.024
Token item
Token ID2378
Feature activation-2.87e+00

Pos logit contributions
 Wally+0.069
 Sebastian+0.067
 Finn+0.064
 fisher+0.064
 Fin+0.063
Neg logit contributions
chair-0.050
appointed-0.047
 roof-0.043
 ceiling-0.042
filled-0.042
Token to
Token ID284
Feature activation-1.20e+00

Pos logit contributions
 Wally+0.130
 fisher+0.127
 Sebastian+0.125
 tuna+0.120
 Otto+0.120
Neg logit contributions
chair-0.098
cases-0.083
 roof-0.082
ring-0.080
appointed-0.079
Token finish
Token ID5461
Feature activation-2.14e+00

Pos logit contributions
 Wally+0.055
 fisher+0.052
 tuna+0.051
 Sebastian+0.051
 shrimp+0.051
Neg logit contributions
appointed-0.038
chair-0.036
 roof-0.033
 halls-0.033
unct-0.033
Token their
Token ID511
Feature activation+9.94e-01

Pos logit contributions
 Wally+0.086
 fisher+0.084
 fisherman+0.081
 fossil+0.079
 Sebastian+0.078
Neg logit contributions
appointed-0.072
 floors-0.069
clock-0.068
 halls-0.068
 walls-0.068
Token collection
Token ID4947
Feature activation-1.30e-01

Pos logit contributions
appointed+0.038
chair+0.036
irds+0.034
ring+0.032
flash+0.032
Neg logit contributions
 tuna-0.043
 Wally-0.043
 shrimp-0.042
 Sebastian-0.039
 fisher-0.039
Token.
Token ID13
Feature activation+8.05e-01

Pos logit contributions
 Wally+0.005
 tuna+0.004
 Sebastian+0.004
 shrimp+0.004
 fisher+0.004
Neg logit contributions
appointed-0.006
chair-0.006
irds-0.005
unct-0.005
 halls-0.005
Token L
Token ID406
Feature activation+6.34e-01

Pos logit contributions
appointed+0.035
chair+0.033
unct+0.032
irds+0.031
flash+0.030
Neg logit contributions
 Wally-0.030
 tuna-0.029
 Sebastian-0.028
 shrimp-0.028
 Finn-0.028
Tokenola
Token ID5708
Feature activation-4.52e-01

Pos logit contributions
appointed+0.030
chair+0.029
unct+0.026
irds+0.026
flash+0.026
Neg logit contributions
 Wally-0.021
 shrimp-0.020
 tuna-0.020
 Sebastian-0.020
 fisher-0.020
Token to
Token ID284
Feature activation-1.22e+00

Pos logit contributions
 Wally+0.024
 Sebastian+0.024
 fisher+0.023
 fossil+0.022
 shrimp+0.022
Neg logit contributions
appointed-0.019
chair-0.019
 halls-0.017
unct-0.017
clock-0.017
Token eat
Token ID4483
Feature activation-1.91e+00

Pos logit contributions
 Sebastian+0.057
 Wally+0.057
 fossil+0.054
 fisher+0.053
 Finn+0.052
Neg logit contributions
appointed-0.032
unct-0.030
 halls-0.028
chair-0.028
oga-0.027
Token.
Token ID13
Feature activation+8.56e-01

Pos logit contributions
 Sebastian+0.087
 fisher+0.086
 fossil+0.085
 Wally+0.081
 fisherman+0.080
Neg logit contributions
chair-0.048
 ceiling-0.045
 walls-0.044
appointed-0.044
ness-0.043
Token She
Token ID1375
Feature activation-1.68e+00

Pos logit contributions
appointed+0.033
unct+0.031
chair+0.030
irds+0.030
ring+0.029
Neg logit contributions
 Wally-0.037
 tuna-0.035
 shrimp-0.034
 Sebastian-0.033
 Finn-0.033
Token thinks
Token ID6834
Feature activation-4.00e-01

Pos logit contributions
 Sebastian+0.082
 Wally+0.079
 shrimp+0.074
 fisher+0.074
 Fin+0.073
Neg logit contributions
chair-0.048
appointed-0.045
 halls-0.045
clock-0.043
unct-0.043
Token tuna
Token ID38883
Feature activation-2.34e+00

Pos logit contributions
 Wally+0.013
 Sebastian+0.013
 fisher+0.012
 shrimp+0.012
 tuna+0.012
Neg logit contributions
chair-0.018
appointed-0.018
 halls-0.016
irds-0.016
unct-0.016
Token is
Token ID318
Feature activation-2.89e+00

Pos logit contributions
 Sebastian+0.125
 Wally+0.114
 fisher+0.111
 named+0.110
 Crab+0.105
Neg logit contributions
bows-0.052
chair-0.051
up-0.050
 sounds-0.050
appointed-0.046
Token the
Token ID262
Feature activation-2.24e+00

Pos logit contributions
 fisher+0.145
 Sebastian+0.142
 fisherman+0.137
Dave+0.132
olph+0.132
Neg logit contributions
appointed-0.056
 biggest-0.055
 sounds-0.055
 same-0.046
clock-0.045
Token best
Token ID1266
Feature activation-2.77e+00

Pos logit contributions
 Sebastian+0.129
 Wally+0.127
 named+0.127
 Fin+0.123
 Finn+0.122
Neg logit contributions
 biggest-0.037
 same-0.035
clock-0.033
 ceiling-0.027
appointed-0.026
Token food
Token ID2057
Feature activation-2.72e+00

Pos logit contributions
 Sebastian+0.095
 named+0.091
 Wally+0.090
Bruce+0.089
Stan+0.089
Neg logit contributions
ness-0.103
est-0.099
night-0.094
 job-0.091
 morning-0.091
Token in
Token ID287
Feature activation-1.70e+00

Pos logit contributions
 Sebastian+0.131
 fisher+0.124
 Wally+0.119
 fossil+0.115
 tuna+0.114
Neg logit contributions
chair-0.088
bows-0.072
cakes-0.068
case-0.067
filled-0.067