TOP ACTIVATIONS
MAX = 0.237

. It 's okay . Acc idents happen . Just be
" It 's OK . Acc idents happen . I 'm
's okay , Tom . Acc idents happen . I 'm
't cry , boys . Acc idents happen . I have
It was an accident . Acc idents happen . I 'm
's okay , Anna . Acc idents happen . I 'm
. It is okay . Acc idents happen . Let me
are proud of you . Acc idents happen . We can
's okay , Lily . Acc idents happen . I 'm
okay , sweet ie . Acc idents happen . Let 's
's okay , Lily . Acc idents happen . Let 's
of the story is : Acc idents happen , but we
's okay , Lily . Acc idents happen . Let 's
's okay , Anna . Acc idents happen . You don
's okay , Lily . Acc idents happen . Just remember
's okay , Lily . Acc idents happen . It 's
's okay , Fiona . Acc idents happen ." Then they
's okay , Sammy . Acc idents happen . Let 's
to their mom . She cleans their hand and gives them
's okay , Lily . Acc idents happen . Just be

MIN ACTIVATIONS
MIN = -0.293

not bigger than a fing ern ail . They also found
the dove with her fing ern ails . The dove was
M ummy used her fing ern ails to gently pull the
called Lucy . She was seven years old and lived in
grown up until you are seven , like me ." But
he discovered that it was seven little birds sc att tering
and Mary . Tom was seven , and Mary was three
pirate . He sailed the seven seas , looking for treasure
named Beth . She was seven years old . She loved
said , " I picked seven flowers and you ?" The
one by one and counted seven in all . To his
Joe . He sailed the seven seas looking for adventure .
, four trees , and seven flowers .
lots of words and their meanings ". M olly
, five , six , seven , eight . She played
, five , six , seven , eight , nine ,
counted five , six , seven , eight . He even
, five , six , seven , eight , nine ,
Amber had a weight of seven and Bella had a weight
, five , six , seven , eight , nine ,

INTERVAL 0.105 - 0.237
CONTAINS 2.334%

the chair . The cup fell off the table . The
way and eventually Joe was able to find his way home
time , there was a little girl called Amy who had
share . Her friends felt sad and left the party early
time , there was a little boy named Tim my .

INTERVAL -0.028 - 0.105
CONTAINS 71.705%

a magic trick . But it didn 't work the first
hand too tightly . She tried to take it off ,
While they were cooking , Lily saw her mom my
was coming in . <|endoftext|> Once upon a time , there
got in the box . " I 'm a famous racer

INTERVAL -0.160 - -0.028
CONTAINS 25.220%

beautiful . <|endoftext|> Mom and Dad took Jack sailing . It
. She also liked to count how many animals there were
, " That 's great ! Be careful next time ."
surprised and happy for the turtle . They cheered and cl
support . She knew that with her Mom 's support ,

INTERVAL -0.293 - -0.160
CONTAINS 0.741%

The thread is too thin for the button . The thread
Jane was so proud of herself for being successful in steering
. Can you do that for me ?" Lu
let 's find something else for me to pinch ."
. They wanted something simple for their new pup . Dad
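
The four interval boundaries listed above are consistent with splitting the range from MIN (-0.293) to MAX (0.237) into four equal-width bins, and the four CONTAINS percentages sum to 100%. The sketch below reproduces that arithmetic. The equal-width binning is an inference from the printed numbers, not a documented property of the tool that produced this listing, and the helper name `equal_width_edges` is hypothetical.

```python
# Assumption: the listing bins activations into equal-width intervals between
# the reported MIN and MAX. This is inferred from the printed numbers only.

def equal_width_edges(lo: float, hi: float, n_bins: int) -> list[float]:
    """Return n_bins + 1 evenly spaced edges from lo to hi."""
    width = (hi - lo) / n_bins
    return [lo + i * width for i in range(n_bins + 1)]

ACT_MIN, ACT_MAX = -0.293, 0.237  # MIN and MAX as printed in the listing
edges = equal_width_edges(ACT_MIN, ACT_MAX, 4)

# Interval bounds as printed in the listing, lowest first.
reported_edges = [-0.293, -0.160, -0.028, 0.105, 0.237]
# Percentage of tokens in each interval, lowest interval first.
reported_share = [0.741, 25.220, 71.705, 2.334]

# Largest difference between a computed edge and the printed one.
max_gap = max(abs(e - r) for e, r in zip(edges, reported_edges))
total_share = sum(reported_share)

print([f"{e:.4f}" for e in edges])
print(f"largest gap to a reported edge: {max_gap:.4f}")
print(f"total share: {total_share:.3f}%")
```

The printed bounds differ from the computed edges by no more than three-decimal rounding. Nearly all tokens (about 97%) fall in the two middle intervals, so both the strongly positive and strongly negative examples are rare.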
Token.
Token ID13
Feature activation+3.37e-03

Pos logit contributions
ad+0.045
ern+0.044
alia+0.041
 meanings+0.040
iles+0.038
Neg logit contributions
 divorce-0.043
 million-0.039
 sudden-0.039
burn-0.038
 helicopter-0.037
Token It
Token ID632
Feature activation+2.69e-02

Pos logit contributions
 divorce+0.005
 million+0.004
 sudden+0.004
burn+0.004
 helicopter+0.004
Neg logit contributions
ad-0.006
ern-0.006
alia-0.005
 meanings-0.005
iles-0.005
Token's
Token ID338
Feature activation+5.36e-02

Pos logit contributions
 divorce+0.044
 sudden+0.044
apy+0.040
burn+0.039
 late+0.039
Neg logit contributions
ad-0.038
ern-0.034
et-0.033
 meanings-0.033
alia-0.032
Token okay
Token ID8788
Feature activation+1.30e-01

Pos logit contributions
 divorce+0.063
 sudden+0.055
 million+0.055
burn+0.053
 helicopter+0.052
Neg logit contributions
ad-0.105
ern-0.105
alia-0.098
 meanings-0.096
iles-0.092
Token.
Token ID13
Feature activation+4.14e-03

Pos logit contributions
 divorce+0.202
 sudden+0.186
 million+0.185
burn+0.180
 helicopter+0.176
Neg logit contributions
ad-0.204
ern-0.201
alia-0.185
 meanings-0.181
iles-0.174
Token Acc
Token ID6366
Feature activation+2.37e-01

Pos logit contributions
 divorce+0.006
 sudden+0.006
 million+0.006
burn+0.005
 helicopter+0.005
Neg logit contributions
ad-0.007
ern-0.007
alia-0.006
 meanings-0.006
 habitats-0.006
Tokenidents
Token ID3231
Feature activation+4.67e-02

Pos logit contributions
 divorce+0.380
 million+0.367
 sudden+0.349
 helicopter+0.324
burn+0.320
Neg logit contributions
ern-0.351
ad-0.349
 meanings-0.330
alia-0.319
artments-0.311
Token happen
Token ID1645
Feature activation+8.38e-02

Pos logit contributions
 divorce+0.073
 million+0.067
 sudden+0.066
burn+0.065
 helicopter+0.064
Neg logit contributions
ad-0.073
ern-0.073
alia-0.068
 meanings-0.065
iles-0.062
Token.
Token ID13
Feature activation+5.17e-03

Pos logit contributions
 divorce+0.119
 million+0.109
 sudden+0.108
burn+0.105
 late+0.104
Neg logit contributions
ad-0.143
ern-0.142
alia-0.133
 meanings-0.128
iles-0.122
Token Just
Token ID2329
Feature activation+4.70e-02

Pos logit contributions
 divorce+0.007
 sudden+0.006
 million+0.006
burn+0.006
 helicopter+0.006
Neg logit contributions
ad-0.009
ern-0.009
alia-0.008
 meanings-0.008
 habitats-0.008
Token be
Token ID307
Feature activation-9.32e-02

Pos logit contributions
 divorce+0.074
 sudden+0.067
burn+0.067
 million+0.067
 helicopter+0.065
Neg logit contributions
ad-0.072
ern-0.071
alia-0.068
 meanings-0.065
io-0.062
Token "
Token ID366
Feature activation+2.17e-02

Pos logit contributions
 divorce+0.016
 sudden+0.015
 million+0.015
burn+0.014
 helicopter+0.014
Neg logit contributions
ern-0.014
ad-0.013
alia-0.012
 meanings-0.012
 Lucas-0.011
TokenIt
Token ID1026
Feature activation+5.48e-02

Pos logit contributions
 divorce+0.031
 million+0.028
 sudden+0.028
 helicopter+0.027
burn+0.027
Neg logit contributions
ad-0.037
ern-0.037
alia-0.035
 meanings-0.033
iles-0.031
Token's
Token ID338
Feature activation+5.01e-02

Pos logit contributions
 divorce+0.074
 sudden+0.071
 million+0.066
burn+0.064
 helicopter+0.062
Neg logit contributions
ad-0.098
ern-0.095
alia-0.092
 meanings-0.089
 Lucas-0.084
Token OK
Token ID7477
Feature activation+8.54e-02

Pos logit contributions
 divorce+0.035
 late+0.028
 million+0.028
burn+0.027
 sudden+0.027
Neg logit contributions
ad-0.122
ern-0.122
alia-0.117
 meanings-0.113
 habitats-0.111
Token.
Token ID13
Feature activation+2.57e-03

Pos logit contributions
 divorce+0.113
 million+0.102
 sudden+0.101
burn+0.099
 helicopter+0.096
Neg logit contributions
ad-0.154
ern-0.153
alia-0.144
 meanings-0.140
iles-0.134
Token Acc
Token ID6366
Feature activation+2.35e-01

Pos logit contributions
 divorce+0.004
 sudden+0.003
 million+0.003
burn+0.003
 helicopter+0.003
Neg logit contributions
ad-0.004
ern-0.004
alia-0.004
 meanings-0.004
 habitats-0.004
Tokenidents
Token ID3231
Feature activation+4.58e-02

Pos logit contributions
 divorce+0.365
 million+0.350
 sudden+0.338
 helicopter+0.315
burn+0.309
Neg logit contributions
ern-0.360
ad-0.355
 meanings-0.336
alia-0.330
artments-0.317
Token happen
Token ID1645
Feature activation+8.04e-02

Pos logit contributions
 divorce+0.067
 million+0.060
 late+0.058
 helicopter+0.058
 sudden+0.057
Neg logit contributions
ern-0.077
ad-0.076
alia-0.074
 meanings-0.069
 habitats-0.067
Token.
Token ID13
Feature activation+4.04e-03

Pos logit contributions
 divorce+0.105
 late+0.096
 million+0.095
 sudden+0.092
burn+0.091
Neg logit contributions
ad-0.146
ern-0.146
alia-0.135
 meanings-0.131
 habitats-0.125
Token I
Token ID314
Feature activation-2.89e-02

Pos logit contributions
 divorce+0.006
 sudden+0.005
 million+0.005
burn+0.005
 helicopter+0.005
Neg logit contributions
ad-0.007
ern-0.007
alia-0.007
 meanings-0.006
 habitats-0.006
Token'm
Token ID1101
Feature activation+3.64e-02

Pos logit contributions
ad+0.050
ern+0.050
alia+0.047
 meanings+0.045
iles+0.044
Neg logit contributions
 divorce-0.041
 million-0.036
 sudden-0.036
burn-0.035
 late-0.034
Token's
Token ID338
Feature activation+5.22e-02

Pos logit contributions
 sudden+0.097
 divorce+0.093
apy+0.084
burn+0.083
 million+0.082
Neg logit contributions
ad-0.083
ern-0.075
 meanings-0.075
 count-0.072
alia-0.071
Token okay
Token ID8788
Feature activation+1.26e-01

Pos logit contributions
 divorce+0.048
 sudden+0.044
 million+0.043
 helicopter+0.039
burn+0.038
Neg logit contributions
ad-0.115
ern-0.115
 meanings-0.108
alia-0.105
et-0.103
Token,
Token ID11
Feature activation+9.55e-03

Pos logit contributions
 divorce+0.154
 million+0.137
 sudden+0.135
burn+0.132
 helicopter+0.130
Neg logit contributions
ern-0.239
ad-0.238
alia-0.226
 meanings-0.218
 habitats-0.211
Token Tom
Token ID4186
Feature activation-5.19e-02

Pos logit contributions
 divorce+0.017
 million+0.016
 sudden+0.016
burn+0.015
 helicopter+0.015
Neg logit contributions
ad-0.013
ern-0.013
alia-0.012
 meanings-0.011
iles-0.011
Token.
Token ID13
Feature activation+5.43e-03

Pos logit contributions
ad+0.085
ern+0.084
alia+0.079
 meanings+0.075
iles+0.072
Neg logit contributions
 divorce-0.078
 sudden-0.071
 million-0.070
burn-0.069
 helicopter-0.067
Token Acc
Token ID6366
Feature activation+2.35e-01

Pos logit contributions
 divorce+0.008
 sudden+0.007
 million+0.007
burn+0.007
 helicopter+0.007
Neg logit contributions
ad-0.009
ern-0.009
alia-0.008
 meanings-0.008
iles-0.008
Tokenidents
Token ID3231
Feature activation+4.50e-02

Pos logit contributions
 divorce+0.372
 million+0.359
 sudden+0.342
 helicopter+0.326
burn+0.316
Neg logit contributions
ern-0.350
ad-0.350
 meanings-0.327
alia-0.321
artments-0.308
Token happen
Token ID1645
Feature activation+8.49e-02

Pos logit contributions
 divorce+0.073
 sudden+0.067
 million+0.067
burn+0.066
 helicopter+0.064
Neg logit contributions
ad-0.068
ern-0.067
alia-0.061
 meanings-0.060
iles-0.057
Token.
Token ID13
Feature activation+6.21e-03

Pos logit contributions
 divorce+0.121
 million+0.111
 sudden+0.109
 late+0.107
burn+0.105
Neg logit contributions
ad-0.145
ern-0.144
alia-0.135
 meanings-0.129
 habitats-0.125
Token I
Token ID314
Feature activation-2.70e-02

Pos logit contributions
 divorce+0.009
 sudden+0.008
 million+0.008
burn+0.008
 helicopter+0.008
Neg logit contributions
ad-0.011
ern-0.010
alia-0.010
 meanings-0.010
 habitats-0.009
Token'm
Token ID1101
Feature activation+3.78e-02

Pos logit contributions
ad+0.046
ern+0.046
alia+0.043
 meanings+0.042
iles+0.039
Neg logit contributions
 divorce-0.038
 million-0.035
 sudden-0.035
burn-0.034
 helicopter-0.034
Token't
Token ID470
Feature activation+6.78e-02

Pos logit contributions
 divorce+0.088
 sudden+0.079
 million+0.079
burn+0.076
 helicopter+0.073
Neg logit contributions
ad-0.143
ern-0.139
alia-0.131
 meanings-0.125
iles-0.121
Token cry
Token ID3960
Feature activation+1.28e-01

Pos logit contributions
 divorce+0.097
 million+0.089
 sudden+0.088
burn+0.084
 helicopter+0.083
Neg logit contributions
ad-0.118
ern-0.114
alia-0.104
 meanings-0.100
iles-0.097
Token,
Token ID11
Feature activation+1.09e-02

Pos logit contributions
 divorce+0.196
 sudden+0.183
 million+0.182
 helicopter+0.171
burn+0.164
Neg logit contributions
ad-0.199
ern-0.198
alia-0.182
 meanings-0.170
iles-0.165
Token boys
Token ID6510
Feature activation-6.55e-03

Pos logit contributions
 divorce+0.014
 sudden+0.013
 million+0.013
burn+0.012
broken+0.012
Neg logit contributions
ad-0.020
ern-0.019
alia-0.017
 meanings-0.017
iles-0.017
Token.
Token ID13
Feature activation+9.31e-03

Pos logit contributions
ad+0.011
ern+0.011
alia+0.010
 meanings+0.010
iles+0.010
Neg logit contributions
 divorce-0.009
 million-0.008
 sudden-0.008
burn-0.008
 helicopter-0.008
Token Acc
Token ID6366
Feature activation+2.34e-01

Pos logit contributions
 divorce+0.012
 sudden+0.012
burn+0.011
 million+0.011
 helicopter+0.011
Neg logit contributions
ad-0.016
ern-0.016
 meanings-0.015
alia-0.015
 Lucas-0.014
Tokenidents
Token ID3231
Feature activation+4.43e-02

Pos logit contributions
 divorce+0.336
 million+0.315
 sudden+0.305
 helicopter+0.290
burn+0.289
Neg logit contributions
ad-0.391
ern-0.386
alia-0.364
 meanings-0.356
io-0.337
Token happen
Token ID1645
Feature activation+8.12e-02

Pos logit contributions
 divorce+0.073
 sudden+0.068
 million+0.066
burn+0.066
 helicopter+0.064
Neg logit contributions
ad-0.066
ern-0.065
alia-0.059
 meanings-0.057
iles-0.055
Token.
Token ID13
Feature activation+1.08e-02

Pos logit contributions
 divorce+0.119
 million+0.109
 sudden+0.108
burn+0.106
 helicopter+0.104
Neg logit contributions
ad-0.134
ern-0.133
alia-0.124
 meanings-0.120
iles-0.115
Token I
Token ID314
Feature activation-3.32e-02

Pos logit contributions
 divorce+0.016
 sudden+0.015
burn+0.014
 million+0.014
 helicopter+0.014
Neg logit contributions
ad-0.018
ern-0.017
 meanings-0.016
alia-0.016
 Lucas-0.015
Token have
Token ID423
Feature activation-9.46e-04

Pos logit contributions
ad+0.052
ern+0.050
alia+0.047
 meanings+0.046
et+0.044
Neg logit contributions
 divorce-0.047
 million-0.046
 helicopter-0.046
 sudden-0.045
burn-0.043
Token It
Token ID632
Feature activation+2.60e-02

Pos logit contributions
 divorce+0.007
 million+0.006
 sudden+0.006
burn+0.006
 helicopter+0.006
Neg logit contributions
ad-0.008
ern-0.008
alia-0.007
 meanings-0.007
iles-0.007
Token was
Token ID373
Feature activation+2.19e-02

Pos logit contributions
 sudden+0.043
 divorce+0.042
burn+0.039
apy+0.039
 million+0.038
Neg logit contributions
ad-0.036
ern-0.035
 meanings-0.033
et-0.033
alia-0.032
Token an
Token ID281
Feature activation-4.71e-04

Pos logit contributions
 divorce+0.026
 late+0.024
 million+0.023
 sudden+0.023
burn+0.023
Neg logit contributions
ad-0.043
ern-0.042
alia-0.040
 meanings-0.039
 habitats-0.038
Token accident
Token ID5778
Feature activation+1.19e-01

Pos logit contributions
ern+0.001
ad+0.001
alia+0.001
 meanings+0.001
 habitats+0.001
Neg logit contributions
 divorce-0.001
 million-0.001
 sudden-0.001
 helicopter-0.001
burn-0.001
Token.
Token ID13
Feature activation+4.30e-03

Pos logit contributions
 divorce+0.215
 sudden+0.204
 million+0.198
 helicopter+0.188
burn+0.188
Neg logit contributions
ad-0.160
ern-0.158
alia-0.143
 meanings-0.134
et-0.133
Token Acc
Token ID6366
Feature activation+2.33e-01

Pos logit contributions
 divorce+0.007
 sudden+0.007
 million+0.006
burn+0.006
 helicopter+0.006
Neg logit contributions
ad-0.007
ern-0.007
alia-0.006
 meanings-0.006
 habitats-0.005
Tokenidents
Token ID3231
Feature activation+4.66e-02

Pos logit contributions
 divorce+0.374
 million+0.360
 sudden+0.342
 helicopter+0.324
burn+0.319
Neg logit contributions
ern-0.348
ad-0.339
 meanings-0.324
alia-0.317
artments-0.304
Token happen
Token ID1645
Feature activation+8.34e-02

Pos logit contributions
 divorce+0.073
 million+0.066
 sudden+0.066
burn+0.064
 helicopter+0.063
Neg logit contributions
ad-0.073
ern-0.073
alia-0.068
 meanings-0.065
iles-0.062
Token.
Token ID13
Feature activation+5.91e-03

Pos logit contributions
 divorce+0.120
 million+0.110
 sudden+0.108
 late+0.106
burn+0.106
Neg logit contributions
ad-0.140
ern-0.139
alia-0.130
 meanings-0.125
 habitats-0.119
Token I
Token ID314
Feature activation-3.02e-02

Pos logit contributions
 divorce+0.009
 sudden+0.008
 million+0.008
burn+0.008
 helicopter+0.007
Neg logit contributions
ad-0.010
ern-0.010
alia-0.009
 meanings-0.009
 habitats-0.008
Token'm
Token ID1101
Feature activation+3.74e-02

Pos logit contributions
ad+0.047
ern+0.046
alia+0.043
 meanings+0.042
 facts+0.038
Neg logit contributions
 divorce-0.045
 sudden-0.044
 million-0.043
 helicopter-0.043
burn-0.042
Token's
Token ID338
Feature activation+4.63e-02

Pos logit contributions
 sudden+0.087
 divorce+0.085
 million+0.077
 late+0.077
 helicopter+0.076
Neg logit contributions
ad-0.077
ern-0.072
 meanings-0.071
alia-0.069
 answers-0.067
Token okay
Token ID8788
Feature activation+1.24e-01

Pos logit contributions
 divorce+0.038
 million+0.032
 sudden+0.032
 helicopter+0.031
burn+0.029
Neg logit contributions
ern-0.107
ad-0.107
alia-0.100
 meanings-0.100
et-0.096
Token,
Token ID11
Feature activation+7.52e-03

Pos logit contributions
 divorce+0.169
 million+0.152
 sudden+0.151
burn+0.148
 helicopter+0.145
Neg logit contributions
ad-0.220
ern-0.218
alia-0.204
 meanings-0.199
iles-0.191
Token Anna
Token ID11735
Feature activation-1.77e-02

Pos logit contributions
 divorce+0.012
 sudden+0.011
 million+0.011
burn+0.011
 helicopter+0.010
Neg logit contributions
ad-0.012
ern-0.012
alia-0.011
 meanings-0.010
iles-0.010
Token.
Token ID13
Feature activation+1.40e-03

Pos logit contributions
ad+0.024
ern+0.023
alia+0.022
 meanings+0.021
et+0.020
Neg logit contributions
 divorce-0.032
 sudden-0.030
 million-0.029
burn-0.028
 helicopter-0.028
Token Acc
Token ID6366
Feature activation+2.33e-01

Pos logit contributions
 divorce+0.002
 sudden+0.002
 million+0.002
burn+0.002
 helicopter+0.002
Neg logit contributions
ad-0.002
ern-0.002
alia-0.002
 meanings-0.002
 habitats-0.002
Tokenidents
Token ID3231
Feature activation+4.21e-02

Pos logit contributions
 divorce+0.380
 million+0.366
 sudden+0.349
 helicopter+0.335
burn+0.322
Neg logit contributions
ern-0.338
ad-0.331
 meanings-0.312
alia-0.304
artments-0.296
Token happen
Token ID1645
Feature activation+8.12e-02

Pos logit contributions
 divorce+0.071
 sudden+0.066
 million+0.065
burn+0.065
 helicopter+0.063
Neg logit contributions
ad-0.061
ern-0.060
alia-0.054
 meanings-0.053
iles-0.051
Token.
Token ID13
Feature activation+2.90e-03

Pos logit contributions
 divorce+0.118
 million+0.108
 sudden+0.107
burn+0.105
 late+0.104
Neg logit contributions
ad-0.135
ern-0.134
alia-0.125
 meanings-0.121
 habitats-0.115
Token I
Token ID314
Feature activation-3.18e-02

Pos logit contributions
 divorce+0.004
 million+0.004
 sudden+0.004
burn+0.004
 helicopter+0.004
Neg logit contributions
ad-0.005
ern-0.005
alia-0.004
 meanings-0.004
 habitats-0.004
Token'm
Token ID1101
Feature activation+3.41e-02

Pos logit contributions
ad+0.053
ern+0.053
alia+0.049
 meanings+0.048
 habitats+0.045
Neg logit contributions
 divorce-0.045
 million-0.042
 sudden-0.042
 helicopter-0.041
burn-0.041
Token.
Token ID13
Feature activation-6.33e-03

Pos logit contributions
ern+0.097
ad+0.096
alia+0.089
 meanings+0.087
 habitats+0.083
Neg logit contributions
 divorce-0.082
 million-0.076
burn-0.074
 sudden-0.073
 late-0.072
Token It
Token ID632
Feature activation+1.83e-02

Pos logit contributions
ad+0.009
ern+0.009
alia+0.008
 meanings+0.008
 Lucas+0.008
Neg logit contributions
 divorce-0.011
 sudden-0.010
 million-0.010
burn-0.009
 helicopter-0.009
Token is
Token ID318
Feature activation-6.13e-02

Pos logit contributions
 sudden+0.031
 divorce+0.031
 late+0.027
apy+0.027
burn+0.026
Neg logit contributions
ad-0.025
et-0.023
ern-0.023
 meanings-0.022
alia-0.022
Token okay
Token ID8788
Feature activation+1.21e-01

Pos logit contributions
ad+0.130
ern+0.129
alia+0.122
 meanings+0.119
iles+0.115
Neg logit contributions
 divorce-0.063
 million-0.054
 sudden-0.054
burn-0.052
 helicopter-0.051
Token.
Token ID13
Feature activation-4.39e-03

Pos logit contributions
 divorce+0.192
 sudden+0.177
 million+0.177
burn+0.171
 helicopter+0.168
Neg logit contributions
ad-0.187
ern-0.184
alia-0.169
 meanings-0.166
iles-0.158
Token Acc
Token ID6366
Feature activation+2.32e-01

Pos logit contributions
ad+0.007
ern+0.006
 meanings+0.006
alia+0.006
artments+0.005
Neg logit contributions
 divorce-0.007
 sudden-0.007
 million-0.007
burn-0.006
 helicopter-0.006
Tokenidents
Token ID3231
Feature activation+4.35e-02

Pos logit contributions
 divorce+0.370
 million+0.352
 sudden+0.337
 helicopter+0.316
burn+0.315
Neg logit contributions
ad-0.344
ern-0.342
 meanings-0.320
alia-0.317
artments-0.304
Token happen
Token ID1645
Feature activation+7.89e-02

Pos logit contributions
 divorce+0.065
 million+0.059
 sudden+0.057
burn+0.057
 helicopter+0.057
Neg logit contributions
ad-0.071
ern-0.071
alia-0.067
 meanings-0.064
 habitats-0.061
Token.
Token ID13
Feature activation-2.09e-03

Pos logit contributions
 divorce+0.117
 million+0.107
 sudden+0.106
burn+0.104
 helicopter+0.102
Neg logit contributions
ad-0.129
ern-0.128
alia-0.120
 meanings-0.116
iles-0.111
Token Let
Token ID3914
Feature activation+1.14e-01

Pos logit contributions
ad+0.003
ern+0.003
 meanings+0.003
alia+0.003
artments+0.003
Neg logit contributions
 divorce-0.003
 sudden-0.003
 million-0.003
burn-0.003
 helicopter-0.003
Token me
Token ID502
Feature activation-1.18e-01

Pos logit contributions
apy+0.203
 divorce+0.200
bul+0.195
 million+0.191
 sudden+0.190
Neg logit contributions
ad-0.140
et-0.129
ern-0.122
 herself-0.117
 me-0.116
Token are
Token ID389
Feature activation-5.68e-02

Pos logit contributions
 divorce+0.108
 million+0.098
burn+0.098
 sudden+0.096
 helicopter+0.094
Neg logit contributions
ern-0.123
ad-0.122
alia-0.114
 meanings-0.109
iles-0.106
Token proud
Token ID6613
Feature activation+1.15e-02

Pos logit contributions
ad+0.102
ern+0.100
alia+0.094
 meanings+0.091
iles+0.089
Neg logit contributions
 divorce-0.074
 sudden-0.066
 million-0.066
burn-0.063
 helicopter-0.060
Token of
Token ID286
Feature activation+6.48e-02

Pos logit contributions
 divorce+0.014
 sudden+0.012
burn+0.012
 million+0.011
 helicopter+0.011
Neg logit contributions
ad-0.023
ern-0.021
alia-0.020
iles-0.020
 meanings-0.020
Token you
Token ID345
Feature activation-1.99e-02

Pos logit contributions
 divorce+0.073
 million+0.065
 sudden+0.064
 helicopter+0.063
burn+0.062
Neg logit contributions
ern-0.130
ad-0.128
alia-0.125
 meanings-0.119
artments-0.114
Token.
Token ID13
Feature activation+8.04e-03

Pos logit contributions
ad+0.035
ern+0.035
alia+0.033
 meanings+0.032
iles+0.030
Neg logit contributions
 divorce-0.027
 sudden-0.025
 million-0.024
burn-0.023
 helicopter-0.023
Token Acc
Token ID6366
Feature activation+2.31e-01

Pos logit contributions
 divorce+0.011
 sudden+0.010
 million+0.010
broken+0.010
burn+0.010
Neg logit contributions
ern-0.014
ad-0.014
 meanings-0.013
alia-0.013
artments-0.012
Tokenidents
Token ID3231
Feature activation+4.27e-02

Pos logit contributions
 divorce+0.367
 million+0.354
 sudden+0.334
 late+0.315
 helicopter+0.312
Neg logit contributions
ad-0.346
ern-0.341
 meanings-0.326
alia-0.324
 habitats-0.304
Token happen
Token ID1645
Feature activation+7.90e-02

Pos logit contributions
 divorce+0.066
 million+0.061
 sudden+0.060
burn+0.059
 helicopter+0.058
Neg logit contributions
ad-0.067
ern-0.067
alia-0.062
 meanings-0.060
iles-0.057
Token.
Token ID13
Feature activation+9.23e-03

Pos logit contributions
 divorce+0.101
 million+0.091
 sudden+0.090
burn+0.088
 helicopter+0.086
Neg logit contributions
ad-0.146
ern-0.145
alia-0.136
 meanings-0.132
iles-0.127
Token We
Token ID775
Feature activation+7.51e-02

Pos logit contributions
 divorce+0.012
 sudden+0.011
 million+0.010
burn+0.010
broken+0.010
Neg logit contributions
ern-0.017
ad-0.016
 meanings-0.016
alia-0.015
artments-0.015
Token can
Token ID460
Feature activation+1.13e-02

Pos logit contributions
 divorce+0.107
 million+0.096
 sudden+0.095
burn+0.094
 helicopter+0.092
Neg logit contributions
ad-0.128
ern-0.128
alia-0.119
 meanings-0.115
iles-0.111
Token's
Token ID338
Feature activation+4.55e-02

Pos logit contributions
 divorce+0.070
 sudden+0.065
burn+0.060
 late+0.060
 million+0.059
Neg logit contributions
ad-0.070
ern-0.067
alia-0.065
 meanings-0.063
et-0.062
Token okay
Token ID8788
Feature activation+1.19e-01

Pos logit contributions
 divorce+0.035
 million+0.028
 sudden+0.028
burn+0.026
 helicopter+0.026
Neg logit contributions
ad-0.109
ern-0.109
alia-0.103
 meanings-0.101
iles-0.098
Token,
Token ID11
Feature activation+6.97e-03

Pos logit contributions
 divorce+0.185
 million+0.169
 sudden+0.169
burn+0.164
 helicopter+0.162
Neg logit contributions
ad-0.190
ern-0.185
alia-0.173
 meanings-0.168
iles-0.162
Token Lily
Token ID20037
Feature activation+4.06e-02

Pos logit contributions
 divorce+0.010
 million+0.009
 sudden+0.009
burn+0.009
 helicopter+0.009
Neg logit contributions
ad-0.012
ern-0.012
alia-0.011
 meanings-0.011
iles-0.010
Token.
Token ID13
Feature activation+4.00e-03

Pos logit contributions
 divorce+0.067
 sudden+0.061
 million+0.061
burn+0.059
 helicopter+0.058
Neg logit contributions
ad-0.061
ern-0.060
alia-0.056
 meanings-0.053
iles-0.051
Token Acc
Token ID6366
Feature activation+2.30e-01

Pos logit contributions
 divorce+0.005
 million+0.005
 sudden+0.005
burn+0.005
 helicopter+0.005
Neg logit contributions
ad-0.007
ern-0.007
alia-0.007
 meanings-0.006
iles-0.006
Tokenidents
Token ID3231
Feature activation+4.21e-02

Pos logit contributions
 divorce+0.345
 million+0.322
 sudden+0.312
 helicopter+0.293
burn+0.292
Neg logit contributions
ad-0.370
ern-0.369
alia-0.346
 meanings-0.341
io-0.321
Token happen
Token ID1645
Feature activation+8.00e-02

Pos logit contributions
 divorce+0.072
 sudden+0.068
burn+0.066
 million+0.066
 helicopter+0.063
Neg logit contributions
ad-0.060
ern-0.058
 meanings-0.052
alia-0.052
iles-0.050
Token.
Token ID13
Feature activation+5.94e-03

Pos logit contributions
 divorce+0.124
 million+0.113
 sudden+0.113
burn+0.110
 helicopter+0.108
Neg logit contributions
ad-0.126
ern-0.125
alia-0.116
 meanings-0.112
iles-0.108
Token I
Token ID314
Feature activation-3.50e-02

Pos logit contributions
 divorce+0.008
 sudden+0.007
 million+0.007
burn+0.007
 helicopter+0.007
Neg logit contributions
ad-0.011
ern-0.011
alia-0.010
 meanings-0.010
iles-0.009
Token'm
Token ID1101
Feature activation+3.24e-02

Pos logit contributions
ad+0.058
ern+0.057
alia+0.054
 meanings+0.053
 habitats+0.049
Neg logit contributions
 divorce-0.049
 million-0.047
 sudden-0.046
 helicopter-0.045
burn-0.045
Token okay
Token ID8788
Feature activation+1.20e-01

Pos logit contributions
 divorce+0.037
 sudden+0.031
 million+0.031
 helicopter+0.030
burn+0.028
Neg logit contributions
ern-0.101
ad-0.101
alia-0.094
 meanings-0.094
et-0.090
Token,
Token ID11
Feature activation+1.69e-03

Pos logit contributions
 divorce+0.147
 million+0.132
 sudden+0.129
burn+0.127
 helicopter+0.123
Neg logit contributions
ad-0.227
ern-0.226
alia-0.213
 meanings-0.207
 habitats-0.200
Token sweet
Token ID6029
Feature activation-6.31e-02

Pos logit contributions
 divorce+0.003
 million+0.002
 sudden+0.002
burn+0.002
 helicopter+0.002
Neg logit contributions
ad-0.003
ern-0.003
alia-0.003
 meanings-0.002
iles-0.002
Tokenie
Token ID494
Feature activation+5.85e-02

Pos logit contributions
ern+0.111
ad+0.109
 meanings+0.102
alia+0.100
artments+0.099
Neg logit contributions
 divorce-0.083
 million-0.078
burn-0.078
 sudden-0.075
 helicopter-0.073
Token.
Token ID13
Feature activation-5.43e-05

Pos logit contributions
 divorce+0.084
 million+0.078
 sudden+0.076
burn+0.075
 helicopter+0.073
Neg logit contributions
ad-0.098
ern-0.097
alia-0.091
 meanings-0.088
 habitats-0.085
Token Acc
Token ID6366
Feature activation+2.30e-01

Pos logit contributions
ad+0.000
ern+0.000
 meanings+0.000
alia+0.000
artments+0.000
Neg logit contributions
 divorce-0.000
 sudden-0.000
 million-0.000
burn-0.000
 helicopter-0.000
Tokenidents
Token ID3231
Feature activation+3.96e-02

Pos logit contributions
 divorce+0.388
 million+0.377
 sudden+0.355
 helicopter+0.339
 late+0.331
Neg logit contributions
ern-0.320
ad-0.309
 meanings-0.305
alia-0.283
artments-0.283
Token happen
Token ID1645
Feature activation+7.58e-02

Pos logit contributions
 divorce+0.064
 sudden+0.059
 million+0.058
burn+0.057
 helicopter+0.056
Neg logit contributions
ad-0.060
ern-0.060
alia-0.055
 meanings-0.053
iles-0.051
Token.
Token ID13
Feature activation+1.54e-03

Pos logit contributions
 divorce+0.097
 million+0.089
 sudden+0.086
 late+0.086
burn+0.084
Neg logit contributions
ad-0.140
ern-0.138
alia-0.130
 meanings-0.126
 habitats-0.122
Token Let
Token ID3914
Feature activation+1.13e-01

Pos logit contributions
 divorce+0.002
 sudden+0.002
 million+0.002
burn+0.002
 helicopter+0.002
Neg logit contributions
ern-0.002
ad-0.002
 meanings-0.002
alia-0.002
artments-0.002
Token's
Token ID338
Feature activation+4.53e-02

Pos logit contributions
apy+0.195
 divorce+0.193
bul+0.192
 late+0.183
 sudden+0.180
Neg logit contributions
ad-0.133
et-0.126
ern-0.122
 herself-0.117
 me-0.117
Token's
Token ID338
Feature activation+4.29e-02

Pos logit contributions
 sudden+0.092
 divorce+0.090
burn+0.085
apy+0.084
 stret+0.084
Neg logit contributions
ad-0.050
 thinks-0.049
 meanings-0.047
 answers-0.047
 count-0.044
Token okay
Token ID8788
Feature activation+1.19e-01

Pos logit contributions
 divorce+0.050
 sudden+0.046
 helicopter+0.045
 million+0.045
apy+0.043
Neg logit contributions
ad-0.083
ern-0.082
 meanings-0.077
et-0.074
 herself-0.074
Token,
Token ID11
Feature activation+5.74e-03

Pos logit contributions
 divorce+0.212
 million+0.195
 sudden+0.195
 helicopter+0.192
burn+0.187
Neg logit contributions
ad-0.164
ern-0.152
 meanings-0.143
alia-0.136
iles-0.129
Token Lily
Token ID20037
Feature activation+4.46e-02

Pos logit contributions
 divorce+0.008
burn+0.008
 sudden+0.007
 helicopter+0.007
 million+0.007
Neg logit contributions
ad-0.009
ern-0.009
 meanings-0.009
 herself-0.008
 facts-0.008
Token.
Token ID13
Feature activation-3.18e-03

Pos logit contributions
 divorce+0.099
 sudden+0.095
apy+0.089
 helicopter+0.089
 million+0.089
Neg logit contributions
ad-0.038
ern-0.036
alia-0.034
 meanings-0.033
et-0.032
Token Acc
Token ID6366
Feature activation+2.30e-01

Pos logit contributions
ad+0.004
ern+0.004
 meanings+0.004
artments+0.004
 habitats+0.004
Neg logit contributions
 divorce-0.005
 sudden-0.005
 million-0.005
burn-0.005
 helicopter-0.005
Tokenidents
Token ID3231
Feature activation+3.98e-02

Pos logit contributions
 million+0.417
 divorce+0.416
 sudden+0.389
 helicopter+0.372
 severe+0.367
Neg logit contributions
ern-0.271
 meanings-0.263
ad-0.260
artments-0.245
eka-0.240
Token happen
Token ID1645
Feature activation+7.67e-02

Pos logit contributions
 divorce+0.073
 sudden+0.072
burn+0.070
 million+0.068
 helicopter+0.066
Neg logit contributions
ad-0.051
ern-0.049
 meanings-0.043
et-0.041
iles-0.041
Token.
Token ID13
Feature activation-3.18e-04

Pos logit contributions
 divorce+0.126
 sudden+0.115
 million+0.113
burn+0.113
 helicopter+0.111
Neg logit contributions
ad-0.112
ern-0.110
alia-0.103
 meanings-0.101
et-0.100
Token Let
Token ID3914
Feature activation+1.12e-01

Pos logit contributions
ad+0.000
 meanings+0.000
ern+0.000
artments+0.000
 habitats+0.000
Neg logit contributions
 divorce-0.001
 sudden-0.001
 million-0.001
burn-0.000
 helicopter-0.000
Token's
Token ID338
Feature activation+4.55e-02

Pos logit contributions
apy+0.240
bul+0.223
 divorce+0.213
 million+0.210
 disco+0.206
Neg logit contributions
 me-0.110
et-0.096
ad-0.092
 herself-0.087
ima-0.076
Token of
Token ID286
Feature activation+6.45e-02

Pos logit contributions
 divorce+0.001
 late+0.001
 million+0.001
 sudden+0.001
 helicopter+0.001
Neg logit contributions
ern-0.002
ad-0.002
alia-0.002
 meanings-0.002
artments-0.002
Token the
Token ID262
Feature activation+1.55e-02

Pos logit contributions
 divorce+0.079
 million+0.069
 sudden+0.069
burn+0.067
 late+0.066
Neg logit contributions
ern-0.123
ad-0.121
alia-0.115
 meanings-0.115
artments-0.109
Token story
Token ID1621
Feature activation-2.77e-02

Pos logit contributions
 divorce+0.022
 million+0.020
 sudden+0.020
burn+0.019
 helicopter+0.019
Neg logit contributions
ern-0.027
ad-0.027
alia-0.025
 meanings-0.024
artments-0.023
Token is
Token ID318
Feature activation-5.83e-02

Pos logit contributions
ern+0.047
ad+0.046
alia+0.044
 meanings+0.040
eda+0.040
Neg logit contributions
 divorce-0.041
 million-0.037
 late-0.035
 helicopter-0.035
burn-0.034
Token:
Token ID25
Feature activation-1.06e-01

Pos logit contributions
ern+0.077
ad+0.076
alia+0.071
et+0.070
 meanings+0.068
Neg logit contributions
 divorce-0.106
 late-0.096
 band-0.091
 sudden-0.091
 million-0.090
Token Acc
Token ID6366
Feature activation+2.29e-01

Pos logit contributions
ad+0.166
ern+0.163
alia+0.151
 meanings+0.149
iles+0.141
Neg logit contributions
 divorce-0.164
 million-0.152
 sudden-0.150
burn-0.148
 helicopter-0.144
Tokenidents
Token ID3231
Feature activation+3.90e-02

Pos logit contributions
 divorce+0.348
 million+0.327
 sudden+0.313
burn+0.300
 helicopter+0.297
Neg logit contributions
ad-0.360
ern-0.359
 meanings-0.341
alia-0.337
artments-0.320
Token happen
Token ID1645
Feature activation+7.44e-02

Pos logit contributions
 divorce+0.046
 late+0.040
 million+0.040
 little+0.038
 better+0.038
Neg logit contributions
alia-0.075
ern-0.075
ad-0.072
 meanings-0.069
 habitats-0.069
Token,
Token ID11
Feature activation+7.07e-03

Pos logit contributions
 divorce+0.085
 late+0.082
 million+0.079
 together+0.076
 sudden+0.075
Neg logit contributions
ern-0.146
ad-0.143
alia-0.133
 meanings-0.130
 habitats-0.126
Token but
Token ID475
Feature activation-5.01e-02

Pos logit contributions
 divorce+0.010
 million+0.009
 sudden+0.009
burn+0.009
 helicopter+0.009
Neg logit contributions
ad-0.012
ern-0.012
alia-0.011
 meanings-0.011
iles-0.010
Token we
Token ID356
Feature activation+6.92e-02

Pos logit contributions
ern+0.090
ad+0.090
alia+0.085
 meanings+0.081
io+0.078
Neg logit contributions
 divorce-0.067
 sudden-0.060
 million-0.058
burn-0.058
 helicopter-0.057
Token's
Token ID338
Feature activation+4.16e-02

Pos logit contributions
 sudden+0.082
 divorce+0.081
burn+0.077
 million+0.074
apy+0.074
Neg logit contributions
ad-0.053
 meanings-0.050
 thinks-0.049
 answers-0.048
 count-0.047
Token okay
Token ID8788
Feature activation+1.17e-01

Pos logit contributions
 divorce+0.041
 million+0.035
 sudden+0.035
 helicopter+0.035
burn+0.034
Neg logit contributions
ad-0.089
ern-0.089
 meanings-0.083
alia-0.081
et-0.080
Token,
Token ID11
Feature activation+8.57e-03

Pos logit contributions
 divorce+0.189
 million+0.174
 sudden+0.171
 helicopter+0.167
burn+0.166
Neg logit contributions
ad-0.179
ern-0.172
 meanings-0.158
alia-0.155
iles-0.149
Token Lily
Token ID20037
Feature activation+4.32e-02

Pos logit contributions
 divorce+0.011
 sudden+0.010
burn+0.010
 million+0.010
 helicopter+0.009
Neg logit contributions
ad-0.015
ern-0.015
 meanings-0.014
alia-0.013
eda-0.013
Token.
Token ID13
Feature activation+5.39e-04

Pos logit contributions
 divorce+0.089
 sudden+0.084
 million+0.080
 helicopter+0.079
apy+0.078
Neg logit contributions
ad-0.045
ern-0.044
alia-0.041
 meanings-0.038
et-0.038
Token Acc
Token ID6366
Feature activation+2.29e-01

Pos logit contributions
 divorce+0.001
 sudden+0.001
 million+0.001
burn+0.001
 helicopter+0.001
Neg logit contributions
ad-0.001
ern-0.001
 meanings-0.001
alia-0.001
artments-0.001
Tokenidents
Token ID3231
Feature activation+4.01e-02

Pos logit contributions
 divorce+0.363
 million+0.347
 sudden+0.330
 helicopter+0.314
 late+0.307
Neg logit contributions
ern-0.343
ad-0.339
alia-0.317
 meanings-0.317
artments-0.298
Token happen
Token ID1645
Feature activation+7.66e-02

Pos logit contributions
 divorce+0.073
 sudden+0.073
burn+0.070
 million+0.068
 helicopter+0.066
Neg logit contributions
ad-0.051
ern-0.048
 meanings-0.043
et-0.042
iles-0.041
Token.
Token ID13
Feature activation+2.93e-03

Pos logit contributions
 divorce+0.125
 sudden+0.114
 million+0.113
burn+0.113
 helicopter+0.111
Neg logit contributions
ad-0.112
ern-0.110
alia-0.103
 meanings-0.101
et-0.099
Token Let
Token ID3914
Feature activation+1.10e-01

Pos logit contributions
 divorce+0.005
 sudden+0.004
 million+0.004
burn+0.004
 helicopter+0.004
Neg logit contributions
ad-0.004
ern-0.004
 meanings-0.004
alia-0.004
artments-0.004
Token's
Token ID338
Feature activation+4.64e-02

Pos logit contributions
apy+0.224
bul+0.214
 divorce+0.209
 million+0.206
 disco+0.199
Neg logit contributions
 me-0.104
et-0.101
ad-0.095
 herself-0.085
ima-0.076
Token's
Token ID338
Feature activation+3.97e-02

Pos logit contributions
 divorce+0.063
 sudden+0.060
 late+0.055
burn+0.054
 helicopter+0.054
Neg logit contributions
ad-0.082
ern-0.080
alia-0.076
 meanings-0.075
et-0.072
Token okay
Token ID8788
Feature activation+1.19e-01

Pos logit contributions
 divorce+0.026
 million+0.020
 sudden+0.020
burn+0.019
 late+0.018
Neg logit contributions
ad-0.099
ern-0.098
alia-0.095
 meanings-0.092
iles-0.090
Token,
Token ID11
Feature activation+4.19e-03

Pos logit contributions
 divorce+0.148
 million+0.132
burn+0.130
 sudden+0.129
 helicopter+0.124
Neg logit contributions
ad-0.225
ern-0.224
alia-0.212
 meanings-0.205
 habitats-0.198
Token Anna
Token ID11735
Feature activation-2.43e-02

Pos logit contributions
 divorce+0.006
 million+0.006
 sudden+0.005
burn+0.005
 helicopter+0.005
Neg logit contributions
ad-0.007
ern-0.007
alia-0.007
 meanings-0.006
iles-0.006
Token.
Token ID13
Feature activation-6.46e-03

Pos logit contributions
ad+0.035
ern+0.034
alia+0.032
 meanings+0.030
et+0.029
Neg logit contributions
 divorce-0.041
 sudden-0.038
 million-0.037
 helicopter-0.036
burn-0.036
Token Acc
Token ID6366
Feature activation+2.28e-01

Pos logit contributions
ad+0.010
ern+0.010
alia+0.009
 meanings+0.009
 habitats+0.009
Neg logit contributions
 divorce-0.010
 sudden-0.009
 million-0.009
burn-0.009
 helicopter-0.009
Tokenidents
Token ID3231
Feature activation+3.83e-02

Pos logit contributions
 divorce+0.333
 million+0.308
 sudden+0.301
 helicopter+0.286
burn+0.283
Neg logit contributions
ern-0.376
ad-0.375
alia-0.350
 meanings-0.346
artments-0.326
Token happen
Token ID1645
Feature activation+7.61e-02

Pos logit contributions
 divorce+0.058
 million+0.053
 sudden+0.052
burn+0.051
 helicopter+0.051
Neg logit contributions
ad-0.062
ern-0.061
alia-0.058
 meanings-0.055
 habitats-0.053
Token.
Token ID13
Feature activation-3.11e-03

Pos logit contributions
 divorce+0.110
 million+0.100
 sudden+0.100
burn+0.097
 helicopter+0.095
Neg logit contributions
ad-0.128
ern-0.127
alia-0.118
 meanings-0.114
iles-0.109
Token You
Token ID921
Feature activation-3.08e-02

Pos logit contributions
ad+0.005
ern+0.005
 meanings+0.005
alia+0.005
 habitats+0.005
Neg logit contributions
 divorce-0.004
 sudden-0.004
 million-0.004
burn-0.004
 helicopter-0.004
Token don
Token ID836
Feature activation+2.39e-02

Pos logit contributions
ad+0.039
 meanings+0.033
 count+0.032
ern+0.032
 answers+0.031
Neg logit contributions
 sudden-0.056
burn-0.054
 divorce-0.054
 million-0.052
apy-0.051
Token's
Token ID338
Feature activation+4.09e-02

Pos logit contributions
 sudden+0.086
 divorce+0.084
burn+0.077
 million+0.077
apy+0.077
Neg logit contributions
ad-0.052
 meanings-0.048
 thinks-0.048
 answers-0.047
 count-0.046
Token okay
Token ID8788
Feature activation+1.17e-01

Pos logit contributions
 divorce+0.045
 helicopter+0.040
 million+0.040
 sudden+0.039
burn+0.036
Neg logit contributions
ad-0.083
ern-0.083
 meanings-0.076
et-0.074
alia-0.074
Token,
Token ID11
Feature activation+3.74e-03

Pos logit contributions
 divorce+0.193
 million+0.179
 sudden+0.175
 helicopter+0.172
burn+0.170
Neg logit contributions
ad-0.174
ern-0.166
 meanings-0.153
alia-0.150
iley-0.140
Token Lily
Token ID20037
Feature activation+4.24e-02

Pos logit contributions
 divorce+0.005
burn+0.005
 sudden+0.005
 helicopter+0.005
 million+0.004
Neg logit contributions
ad-0.006
ern-0.006
 meanings-0.006
 herself-0.005
 facts-0.005
Token.
Token ID13
Feature activation-4.34e-03

Pos logit contributions
 divorce+0.089
 sudden+0.085
 million+0.080
 helicopter+0.080
apy+0.079
Neg logit contributions
ad-0.042
ern-0.041
alia-0.037
 meanings-0.036
et-0.035
Token Acc
Token ID6366
Feature activation+2.27e-01

Pos logit contributions
ad+0.006
ern+0.006
 meanings+0.006
artments+0.005
alia+0.005
Neg logit contributions
 divorce-0.007
 sudden-0.007
 million-0.007
burn-0.007
 helicopter-0.006
Tokenidents
Token ID3231
Feature activation+3.74e-02

Pos logit contributions
 divorce+0.396
 million+0.389
 sudden+0.367
 helicopter+0.346
 severe+0.342
Neg logit contributions
ern-0.294
ad-0.289
 meanings-0.284
artments-0.262
eka-0.256
Token happen
Token ID1645
Feature activation+7.36e-02

Pos logit contributions
 divorce+0.066
 sudden+0.063
burn+0.061
 million+0.061
 helicopter+0.059
Neg logit contributions
ad-0.051
ern-0.050
 meanings-0.044
alia-0.042
et-0.042
Token.
Token ID13
Feature activation-1.77e-03

Pos logit contributions
 divorce+0.120
 sudden+0.109
 million+0.108
burn+0.107
 helicopter+0.105
Neg logit contributions
ad-0.109
ern-0.108
alia-0.101
 meanings-0.099
et-0.096
Token Just
Token ID2329
Feature activation+3.29e-02

Pos logit contributions
ad+0.002
ern+0.002
 meanings+0.002
artments+0.002
 habitats+0.002
Neg logit contributions
 divorce-0.003
 sudden-0.003
 million-0.003
burn-0.003
 helicopter-0.003
Token remember
Token ID3505
Feature activation-4.26e-02

Pos logit contributions
 divorce+0.055
 million+0.051
 sudden+0.050
burn+0.050
 helicopter+0.049
Neg logit contributions
ad-0.048
ern-0.047
alia-0.044
 meanings-0.042
iles-0.040
Token's
Token ID338
Feature activation+4.07e-02

Pos logit contributions
 sudden+0.074
 divorce+0.073
burn+0.068
 helicopter+0.066
 late+0.065
Neg logit contributions
ad-0.064
 meanings-0.062
ern-0.058
 answers-0.058
alia-0.058
Token okay
Token ID8788
Feature activation+1.17e-01

Pos logit contributions
 divorce+0.036
 sudden+0.031
 million+0.030
 helicopter+0.030
burn+0.029
Neg logit contributions
ad-0.092
ern-0.091
 meanings-0.086
alia-0.085
et-0.082
Token,
Token ID11
Feature activation+5.32e-03

Pos logit contributions
 divorce+0.177
 million+0.162
 sudden+0.162
 helicopter+0.156
burn+0.156
Neg logit contributions
ad-0.190
ern-0.186
alia-0.172
 meanings-0.170
iles-0.159
Token Lily
Token ID20037
Feature activation+4.14e-02

Pos logit contributions
 divorce+0.006
 sudden+0.006
burn+0.006
 million+0.006
 helicopter+0.005
Neg logit contributions
ad-0.010
ern-0.010
 meanings-0.009
alia-0.009
 facts-0.009
Token.
Token ID13
Feature activation-3.26e-03

Pos logit contributions
 divorce+0.080
 sudden+0.076
 million+0.072
 helicopter+0.072
apy+0.070
Neg logit contributions
ad-0.049
ern-0.048
alia-0.045
 meanings-0.043
et-0.040
Token Acc
Token ID6366
Feature activation+2.27e-01

Pos logit contributions
ad+0.005
ern+0.005
 meanings+0.005
alia+0.004
artments+0.004
Neg logit contributions
 divorce-0.005
 sudden-0.005
 million-0.005
burn-0.005
 helicopter-0.005
Tokenidents
Token ID3231
Feature activation+3.74e-02

Pos logit contributions
 divorce+0.357
 million+0.339
 sudden+0.327
 helicopter+0.311
burn+0.301
Neg logit contributions
ern-0.344
ad-0.340
 meanings-0.320
alia-0.317
artments-0.300
Token happen
Token ID1645
Feature activation+7.65e-02

Pos logit contributions
 divorce+0.064
 sudden+0.062
burn+0.060
 million+0.059
 helicopter+0.057
Neg logit contributions
ad-0.053
ern-0.051
 meanings-0.046
alia-0.045
iles-0.043
Token.
Token ID13
Feature activation-6.66e-04

Pos logit contributions
 divorce+0.118
 sudden+0.107
 million+0.106
burn+0.105
 helicopter+0.103
Neg logit contributions
ad-0.121
ern-0.119
alia-0.112
 meanings-0.109
iles-0.105
Token It
Token ID632
Feature activation+1.77e-02

Pos logit contributions
ad+0.001
ern+0.001
 meanings+0.001
alia+0.001
artments+0.001
Neg logit contributions
 divorce-0.001
 sudden-0.001
 million-0.001
burn-0.001
 helicopter-0.001
Token's
Token ID338
Feature activation+4.56e-02

Pos logit contributions
 sudden+0.033
apy+0.030
 divorce+0.030
burn+0.028
 million+0.028
Neg logit contributions
 meanings-0.020
 answers-0.020
ad-0.020
et-0.020
 thinks-0.019
Token's
Token ID338
Feature activation+4.12e-02

Pos logit contributions
 divorce+0.066
 sudden+0.064
burn+0.060
 million+0.058
 late+0.057
Neg logit contributions
ad-0.084
ern-0.080
 meanings-0.078
alia-0.078
 Lucas-0.073
Token okay
Token ID8788
Feature activation+1.17e-01

Pos logit contributions
 divorce+0.030
 million+0.025
 sudden+0.024
 late+0.023
burn+0.023
Neg logit contributions
ad-0.099
ern-0.099
alia-0.095
 meanings-0.092
iles-0.090
Token,
Token ID11
Feature activation+8.49e-03

Pos logit contributions
 divorce+0.142
 million+0.127
 sudden+0.126
burn+0.124
 helicopter+0.119
Neg logit contributions
ad-0.223
ern-0.221
alia-0.210
 meanings-0.203
iles-0.196
Token Fiona
Token ID42207
Feature activation-6.22e-02

Pos logit contributions
 divorce+0.012
 million+0.011
 sudden+0.011
burn+0.011
 helicopter+0.011
Neg logit contributions
ad-0.014
ern-0.014
alia-0.013
 meanings-0.013
iles-0.012
Token.
Token ID13
Feature activation+4.58e-04

Pos logit contributions
ad+0.109
ern+0.108
alia+0.101
 meanings+0.099
iles+0.095
Neg logit contributions
 divorce-0.085
 million-0.077
 sudden-0.075
burn-0.074
 helicopter-0.073
Token Acc
Token ID6366
Feature activation+2.27e-01

Pos logit contributions
 divorce+0.001
 sudden+0.001
 million+0.001
burn+0.001
 helicopter+0.001
Neg logit contributions
ad-0.001
ern-0.001
alia-0.001
 meanings-0.001
 habitats-0.001
Tokenidents
Token ID3231
Feature activation+3.64e-02

Pos logit contributions
 divorce+0.348
 million+0.330
 sudden+0.315
 helicopter+0.301
burn+0.298
Neg logit contributions
ern-0.357
ad-0.351
 meanings-0.326
alia-0.322
artments-0.310
Token happen
Token ID1645
Feature activation+7.00e-02

Pos logit contributions
 divorce+0.052
 million+0.047
 helicopter+0.045
 late+0.045
 sudden+0.045
Neg logit contributions
ern-0.061
ad-0.061
alia-0.060
 meanings-0.055
 habitats-0.054
Token."
Token ID526
Feature activation+3.19e-02

Pos logit contributions
 divorce+0.094
 million+0.086
 sudden+0.084
 late+0.082
burn+0.082
Neg logit contributions
ad-0.125
ern-0.124
alia-0.116
 meanings-0.112
 habitats-0.107
Token Then
Token ID3244
Feature activation+2.85e-02

Pos logit contributions
 divorce+0.041
burn+0.034
 million+0.034
 helicopter+0.034
 sudden+0.033
Neg logit contributions
ern-0.062
ad-0.060
alia-0.057
 meanings-0.054
 Lucas-0.052
Token they
Token ID484
Feature activation+1.08e-01

Pos logit contributions
 divorce+0.034
 sudden+0.031
 million+0.031
burn+0.030
 helicopter+0.029
Neg logit contributions
ad-0.055
ern-0.054
alia-0.051
 meanings-0.050
iles-0.049
Token's
Token ID338
Feature activation+3.88e-02

Pos logit contributions
 divorce+0.063
 sudden+0.062
burn+0.056
apy+0.054
 late+0.054
Neg logit contributions
ad-0.061
et-0.055
 meanings-0.055
ern-0.054
alia-0.053
Token okay
Token ID8788
Feature activation+1.09e-01

Pos logit contributions
 divorce+0.034
 sudden+0.029
 million+0.029
burn+0.027
 helicopter+0.026
Neg logit contributions
ad-0.088
ern-0.087
alia-0.082
 meanings-0.082
iles-0.079
Token,
Token ID11
Feature activation+3.14e-03

Pos logit contributions
 divorce+0.121
 million+0.107
 sudden+0.104
burn+0.103
 helicopter+0.102
Neg logit contributions
ern-0.220
ad-0.218
alia-0.209
 meanings-0.201
 habitats-0.196
Token Sammy
Token ID44067
Feature activation-1.32e-03

Pos logit contributions
 divorce+0.004
 sudden+0.004
 million+0.004
burn+0.004
 late+0.004
Neg logit contributions
ad-0.006
ern-0.005
 meanings-0.005
alia-0.005
iles-0.005
Token.
Token ID13
Feature activation+6.57e-03

Pos logit contributions
ad+0.002
ern+0.002
alia+0.002
 meanings+0.002
iles+0.002
Neg logit contributions
 divorce-0.002
 sudden-0.002
 million-0.002
burn-0.002
 helicopter-0.002
Token Acc
Token ID6366
Feature activation+2.27e-01

Pos logit contributions
 divorce+0.009
 sudden+0.009
 million+0.009
burn+0.008
 helicopter+0.008
Neg logit contributions
ad-0.011
ern-0.011
 meanings-0.010
alia-0.010
 habitats-0.010
Tokenidents
Token ID3231
Feature activation+3.52e-02

Pos logit contributions
 divorce+0.327
 million+0.303
 sudden+0.294
burn+0.279
 helicopter+0.278
Neg logit contributions
ad-0.379
ern-0.374
alia-0.351
 meanings-0.347
artments-0.325
Token happen
Token ID1645
Feature activation+7.29e-02

Pos logit contributions
 divorce+0.057
 sudden+0.052
 million+0.052
burn+0.051
 helicopter+0.050
Neg logit contributions
ad-0.053
ern-0.053
alia-0.048
 meanings-0.047
iles-0.045
Token.
Token ID13
Feature activation+8.15e-03

Pos logit contributions
 divorce+0.086
 million+0.078
 sudden+0.076
burn+0.074
 late+0.074
Neg logit contributions
ad-0.142
ern-0.141
alia-0.133
 meanings-0.129
 habitats-0.124
Token Let
Token ID3914
Feature activation+1.07e-01

Pos logit contributions
 divorce+0.012
 sudden+0.011
 million+0.011
burn+0.010
 helicopter+0.010
Neg logit contributions
ad-0.014
ern-0.014
 meanings-0.013
alia-0.012
 habitats-0.012
Token's
Token ID338
Feature activation+4.47e-02

Pos logit contributions
apy+0.193
 divorce+0.188
bul+0.180
 million+0.179
 sudden+0.179
Neg logit contributions
ad-0.126
et-0.119
 me-0.108
 herself-0.098
dad-0.098
Token to
Token ID284
Feature activation-1.54e-02

Pos logit contributions
ern+0.045
ad+0.045
alia+0.043
 meanings+0.041
iles+0.040
Neg logit contributions
 divorce-0.034
 late-0.029
 million-0.029
 helicopter-0.029
burn-0.029
Token their
Token ID511
Feature activation+3.33e-02

Pos logit contributions
alia+0.034
ern+0.034
ad+0.032
 meanings+0.031
ai+0.031
Neg logit contributions
 divorce-0.014
 little-0.013
 band-0.013
 baby-0.011
 better-0.011
Token mom
Token ID1995
Feature activation+3.96e-02

Pos logit contributions
 divorce+0.037
 sudden+0.032
 band+0.032
 helicopter+0.031
 million+0.029
Neg logit contributions
ern-0.070
ad-0.068
alia-0.066
 meanings-0.061
ales-0.060
Token.
Token ID13
Feature activation+1.10e-02

Pos logit contributions
 divorce+0.049
burn+0.044
 late+0.043
 million+0.043
 helicopter+0.042
Neg logit contributions
ern-0.073
ad-0.073
 meanings-0.069
alia-0.068
 habitats-0.065
Token She
Token ID1375
Feature activation-3.37e-02

Pos logit contributions
 divorce+0.014
 million+0.013
 sudden+0.013
burn+0.012
 helicopter+0.012
Neg logit contributions
ad-0.021
ern-0.020
alia-0.019
 meanings-0.019
 habitats-0.018
Token cleans
Token ID20658
Feature activation+2.26e-01

Pos logit contributions
ern+0.066
ad+0.061
alia+0.058
ales+0.057
 meanings+0.056
Neg logit contributions
 divorce-0.040
 band-0.036
 better-0.036
 helicopter-0.036
 little-0.035
Token their
Token ID511
Feature activation+3.39e-02

Pos logit contributions
 divorce+0.300
 million+0.271
 sudden+0.268
burn+0.259
 helicopter+0.254
Neg logit contributions
ad-0.411
ern-0.406
alia-0.381
 meanings-0.369
iles-0.355
Token hand
Token ID1021
Feature activation+3.99e-02

Pos logit contributions
 divorce+0.040
 band+0.038
 sudden+0.037
 million+0.035
 little+0.035
Neg logit contributions
ern-0.065
ad-0.065
alia-0.062
 meanings-0.058
iles-0.058
Token and
Token ID290
Feature activation+5.23e-02

Pos logit contributions
 divorce+0.066
 sudden+0.061
 million+0.061
 helicopter+0.057
burn+0.057
Neg logit contributions
ad-0.060
ern-0.059
alia-0.054
 meanings-0.052
et-0.051
Token gives
Token ID3607
Feature activation+2.42e-02

Pos logit contributions
 divorce+0.073
 million+0.065
 sudden+0.065
burn+0.064
 band+0.064
Neg logit contributions
ern-0.091
ad-0.090
alia-0.085
 meanings-0.082
iles-0.080
Token them
Token ID606
Feature activation-5.23e-04

Pos logit contributions
 divorce+0.036
 million+0.032
 sudden+0.032
burn+0.032
 helicopter+0.031
Neg logit contributions
ad-0.040
ern-0.040
alia-0.037
 meanings-0.036
iles-0.034
Token's
Token ID338
Feature activation+3.58e-02

Pos logit contributions
 divorce+0.062
 sudden+0.059
burn+0.054
 late+0.054
 million+0.053
Neg logit contributions
ad-0.069
 meanings-0.065
ern-0.064
alia-0.062
et-0.061
Token okay
Token ID8788
Feature activation+1.15e-01

Pos logit contributions
 divorce+0.025
 million+0.020
 sudden+0.019
burn+0.019
 late+0.018
Neg logit contributions
ad-0.088
ern-0.087
alia-0.084
 meanings-0.082
iles-0.079
Token,
Token ID11
Feature activation+6.41e-03

Pos logit contributions
 divorce+0.164
 million+0.149
 sudden+0.149
burn+0.144
 helicopter+0.142
Neg logit contributions
ad-0.196
ern-0.193
alia-0.180
 meanings-0.176
iles-0.168
Token Lily
Token ID20037
Feature activation+4.30e-02

Pos logit contributions
 divorce+0.008
 sudden+0.007
burn+0.007
 million+0.007
 helicopter+0.006
Neg logit contributions
ad-0.012
ern-0.012
 meanings-0.011
alia-0.011
 facts-0.010
Token.
Token ID13
Feature activation-1.57e-03

Pos logit contributions
 divorce+0.078
 sudden+0.073
 million+0.070
 helicopter+0.068
burn+0.067
Neg logit contributions
ad-0.057
ern-0.055
alia-0.052
 meanings-0.050
et-0.047
Token Acc
Token ID6366
Feature activation+2.25e-01

Pos logit contributions
ad+0.002
ern+0.002
 meanings+0.002
alia+0.002
 habitats+0.002
Neg logit contributions
 divorce-0.003
 sudden-0.002
 million-0.002
burn-0.002
 helicopter-0.002
Tokenidents
Token ID3231
Feature activation+3.30e-02

Pos logit contributions
 divorce+0.375
 million+0.365
 sudden+0.348
 helicopter+0.327
burn+0.323
Neg logit contributions
ern-0.313
ad-0.306
 meanings-0.300
artments-0.281
alia-0.280
Token happen
Token ID1645
Feature activation+7.03e-02

Pos logit contributions
 divorce+0.051
 million+0.046
 sudden+0.045
burn+0.045
 helicopter+0.044
Neg logit contributions
ad-0.053
ern-0.052
alia-0.049
 meanings-0.047
iles-0.045
Token.
Token ID13
Feature activation+9.60e-04

Pos logit contributions
 divorce+0.095
 million+0.086
 sudden+0.085
burn+0.084
 helicopter+0.082
Neg logit contributions
ad-0.124
ern-0.123
alia-0.115
 meanings-0.112
iles-0.108
Token Just
Token ID2329
Feature activation+3.21e-02

Pos logit contributions
 divorce+0.001
 sudden+0.001
 million+0.001
burn+0.001
 helicopter+0.001
Neg logit contributions
ad-0.001
ern-0.001
 meanings-0.001
alia-0.001
artments-0.001
Token be
Token ID307
Feature activation-1.03e-01

Pos logit contributions
 divorce+0.050
burn+0.046
 sudden+0.045
 million+0.044
 band+0.044
Neg logit contributions
ad-0.050
ern-0.049
alia-0.048
 meanings-0.045
io-0.044
Token not
Token ID407
Feature activation-2.18e-02

Pos logit contributions
 divorce+0.005
 late+0.005
 million+0.005
 sudden+0.004
burn+0.004
Neg logit contributions
ad-0.009
ern-0.009
alia-0.008
 meanings-0.008
io-0.008
Token bigger
Token ID5749
Feature activation+3.80e-02

Pos logit contributions
ad+0.038
ern+0.038
alia+0.035
 meanings+0.034
iles+0.033
Neg logit contributions
 divorce-0.029
 sudden-0.026
burn-0.026
 million-0.026
 helicopter-0.025
Token than
Token ID621
Feature activation-1.98e-01

Pos logit contributions
 divorce+0.058
 million+0.055
 late+0.055
 sudden+0.054
 accidents+0.052
Neg logit contributions
ern-0.061
ad-0.059
alia-0.055
 meanings-0.051
eda-0.050
Token a
Token ID257
Feature activation+3.74e-02

Pos logit contributions
ern+0.348
ad+0.343
alia+0.326
 meanings+0.309
io+0.295
Neg logit contributions
 divorce-0.275
 million-0.246
 sudden-0.242
 helicopter-0.237
burn-0.235
Token fing
Token ID8038
Feature activation-2.48e-02

Pos logit contributions
 divorce+0.046
 sudden+0.041
 million+0.041
burn+0.040
 late+0.039
Neg logit contributions
ad-0.069
ern-0.069
alia-0.065
 meanings-0.063
iles-0.061
Tokenern
Token ID1142
Feature activation-2.93e-01

Pos logit contributions
 facts+0.023
 meanings+0.022
 habitats+0.020
 Lucas+0.020
alia+0.020
Neg logit contributions
 divorce-0.048
burn-0.043
 unst-0.043
 boo-0.042
Acc-0.042
Tokenail
Token ID603
Feature activation-1.28e-02

Pos logit contributions
 meanings+0.444
ad+0.436
ern+0.423
 facts+0.392
alia+0.386
Neg logit contributions
 divorce-0.435
 sudden-0.397
burn-0.396
 million-0.396
 late-0.393
Token.
Token ID13
Feature activation-2.19e-02

Pos logit contributions
ern+0.023
ad+0.023
alia+0.022
 meanings+0.021
iles+0.020
Neg logit contributions
 divorce-0.016
 million-0.015
burn-0.014
 helicopter-0.014
 sudden-0.014
Token They
Token ID1119
Feature activation+3.90e-03

Pos logit contributions
ad+0.043
ern+0.043
alia+0.041
 meanings+0.038
iles+0.038
Neg logit contributions
 divorce-0.025
 sudden-0.022
 million-0.022
 late-0.022
 helicopter-0.021
Token also
Token ID635
Feature activation-4.12e-02

Pos logit contributions
 divorce+0.004
 sudden+0.004
 helicopter+0.004
 million+0.004
 sank+0.004
Neg logit contributions
ern-0.008
ad-0.008
alia-0.007
dad-0.007
io-0.007
Token found
Token ID1043
Feature activation-3.24e-02

Pos logit contributions
ern+0.088
ad+0.079
alia+0.077
iles+0.076
 meanings+0.076
Neg logit contributions
 destroyed-0.039
 divorce-0.038
 blew-0.037
 sank-0.036
 million-0.036
Token the
Token ID262
Feature activation+1.91e-03

Pos logit contributions
 divorce+0.019
 million+0.017
 sudden+0.017
burn+0.016
 helicopter+0.016
Neg logit contributions
ad-0.035
ern-0.034
alia-0.032
 meanings-0.032
iles-0.030
Token dove
Token ID38862
Feature activation-4.96e-03

Pos logit contributions
 divorce+0.003
 million+0.003
 sudden+0.002
burn+0.002
 late+0.002
Neg logit contributions
ad-0.003
ern-0.003
alia-0.003
 meanings-0.003
et-0.003
Token with
Token ID351
Feature activation-5.58e-02

Pos logit contributions
ad+0.008
ern+0.008
et+0.007
alia+0.007
ales+0.006
Neg logit contributions
 divorce-0.008
 sudden-0.007
 million-0.007
 helicopter-0.006
apy-0.006
Token her
Token ID607
Feature activation-4.53e-02

Pos logit contributions
ad+0.083
ern+0.076
 meanings+0.070
alia+0.069
et+0.067
Neg logit contributions
 divorce-0.097
burn-0.085
 million-0.084
 sudden-0.084
 helicopter-0.083
Token fing
Token ID8038
Feature activation-1.99e-02

Pos logit contributions
ad+0.073
ern+0.072
 meanings+0.065
alia+0.065
et+0.062
Neg logit contributions
 divorce-0.070
 million-0.063
 sudden-0.060
 late-0.060
burn-0.060
Tokenern
Token ID1142
Feature activation-2.88e-01

Pos logit contributions
 facts+0.012
 Lucas+0.011
 meanings+0.011
 poses+0.010
 habitats+0.010
Neg logit contributions
 divorce-0.044
burn-0.043
 band-0.041
 unst-0.040
Acc-0.040
Tokenails
Token ID1768
Feature activation-4.38e-02

Pos logit contributions
ad+0.475
ern+0.462
 meanings+0.446
alia+0.439
 facts+0.418
Neg logit contributions
 divorce-0.404
 million-0.372
 sudden-0.372
burn-0.368
 late-0.351
Token.
Token ID13
Feature activation-1.94e-02

Pos logit contributions
ern+0.075
ad+0.075
alia+0.070
 meanings+0.068
iles+0.066
Neg logit contributions
 divorce-0.061
 million-0.056
 sudden-0.055
burn-0.054
 helicopter-0.053
Token The
Token ID383
Feature activation+1.43e-02

Pos logit contributions
ern+0.023
ad+0.022
 meanings+0.020
alia+0.020
 herself+0.018
Neg logit contributions
 divorce-0.037
burn-0.036
 million-0.034
 sudden-0.033
 helicopter-0.033
Token dove
Token ID38862
Feature activation-4.18e-03

Pos logit contributions
 divorce+0.024
 million+0.023
 late+0.021
burn+0.020
 sudden+0.020
Neg logit contributions
ad-0.021
ern-0.021
alia-0.019
 meanings-0.018
et-0.018
Token was
Token ID373
Feature activation+6.29e-03

Pos logit contributions
ad+0.007
et+0.006
ern+0.006
 meanings+0.005
io+0.005
Neg logit contributions
 divorce-0.007
 late-0.006
burn-0.006
 million-0.005
 sudden-0.005
Token M
Token ID337
Feature activation+1.13e-02

Pos logit contributions
 divorce+0.010
 million+0.009
 sudden+0.009
 helicopter+0.008
burn+0.008
Neg logit contributions
ern-0.018
ad-0.018
alia-0.017
 meanings-0.017
 habitats-0.016
Tokenummy
Token ID13513
Feature activation-4.76e-02

Pos logit contributions
 divorce+0.017
 sudden+0.015
 million+0.015
burn+0.015
 late+0.015
Neg logit contributions
ad-0.018
ern-0.018
alia-0.016
 meanings-0.016
 habitats-0.015
Token used
Token ID973
Feature activation+2.56e-02

Pos logit contributions
ern+0.088
ad+0.087
 meanings+0.081
alia+0.081
 habitats+0.078
Neg logit contributions
 divorce-0.060
 million-0.054
 sudden-0.054
burn-0.052
 helicopter-0.051
Token her
Token ID607
Feature activation-3.47e-02

Pos logit contributions
 divorce+0.029
 sudden+0.026
 million+0.025
 band+0.025
 late+0.025
Neg logit contributions
ern-0.050
ad-0.049
alia-0.048
iles-0.046
 meanings-0.046
Token fing
Token ID8038
Feature activation-2.34e-02

Pos logit contributions
ad+0.059
ern+0.058
alia+0.055
 meanings+0.053
iles+0.051
Neg logit contributions
 divorce-0.050
 million-0.045
 sudden-0.045
burn-0.044
 helicopter-0.043
Tokenern
Token ID1142
Feature activation-2.86e-01

Pos logit contributions
 habitats+0.015
 facts+0.015
 meanings+0.014
 Lucas+0.013
 cultures+0.012
Neg logit contributions
 divorce-0.049
 unst-0.047
Acc-0.047
burn-0.046
 late-0.046
Tokenails
Token ID1768
Feature activation-4.15e-02

Pos logit contributions
ad+0.519
ern+0.516
 meanings+0.511
 habitats+0.484
alia+0.484
Neg logit contributions
 divorce-0.334
 sudden-0.310
 million-0.299
 band-0.294
burn-0.289
Token to
Token ID284
Feature activation-2.21e-02

Pos logit contributions
ern+0.083
ad+0.081
alia+0.078
 meanings+0.076
 habitats+0.074
Neg logit contributions
 divorce-0.046
 million-0.041
 sudden-0.041
burn-0.039
 helicopter-0.039
Token gently
Token ID15165
Feature activation+4.87e-02

Pos logit contributions
ern+0.040
alia+0.040
ad+0.038
iles+0.038
 habitats+0.038
Neg logit contributions
 divorce-0.026
 band-0.026
 sudden-0.025
 little-0.024
burn-0.023
Token pull
Token ID2834
Feature activation+2.32e-03

Pos logit contributions
 divorce+0.056
 million+0.051
 sudden+0.050
burn+0.050
 band+0.050
Neg logit contributions
ad-0.093
ern-0.092
alia-0.089
iles-0.088
 meanings-0.087
Token the
Token ID262
Feature activation+1.36e-02

Pos logit contributions
 divorce+0.003
 sudden+0.002
 million+0.002
 late+0.002
 helicopter+0.002
Neg logit contributions
ern-0.004
ad-0.004
alia-0.004
 meanings-0.004
 habitats-0.004
Token called
Token ID1444
Feature activation-4.86e-02

Pos logit contributions
 divorce+0.058
 million+0.052
 sudden+0.051
burn+0.050
 helicopter+0.049
Neg logit contributions
ad-0.079
ern-0.078
alia-0.073
 meanings-0.070
iles-0.068
Token Lucy
Token ID22162
Feature activation-5.19e-04

Pos logit contributions
ad+0.080
ern+0.078
alia+0.074
 meanings+0.073
iles+0.069
Neg logit contributions
 divorce-0.072
 million-0.064
burn-0.063
 sudden-0.063
 helicopter-0.061
Token.
Token ID13
Feature activation-2.61e-02

Pos logit contributions
ad+0.001
ern+0.001
alia+0.001
 meanings+0.001
iles+0.001
Neg logit contributions
 divorce-0.001
 million-0.001
 sudden-0.001
burn-0.001
 helicopter-0.001
Token She
Token ID1375
Feature activation-8.22e-02

Pos logit contributions
ad+0.042
ern+0.040
alia+0.040
iles+0.038
 meanings+0.037
Neg logit contributions
 divorce-0.039
 million-0.036
 late-0.035
 helicopter-0.035
 sudden-0.035
Token was
Token ID373
Feature activation-6.86e-04

Pos logit contributions
ad+0.138
ern+0.138
alia+0.129
 meanings+0.125
iles+0.120
Neg logit contributions
 divorce-0.118
 sudden-0.108
 million-0.108
burn-0.104
 helicopter-0.103
Token seven
Token ID3598
Feature activation-2.64e-01

Pos logit contributions
ad+0.001
ern+0.001
alia+0.001
 meanings+0.001
iles+0.001
Neg logit contributions
 divorce-0.001
 million-0.001
 sudden-0.001
burn-0.001
 helicopter-0.001
Token years
Token ID812
Feature activation-1.11e-01

Pos logit contributions
ern+0.444
ad+0.414
artments+0.405
alia+0.402
 Lucas+0.391
Neg logit contributions
 sudden-0.373
 little-0.369
 divorce-0.348
 band-0.345
 helicopter-0.341
Token old
Token ID1468
Feature activation-9.90e-02

Pos logit contributions
ad+0.152
ern+0.150
alia+0.138
 meanings+0.132
iles+0.126
Neg logit contributions
 divorce-0.196
 million-0.181
 sudden-0.181
burn-0.180
 helicopter-0.171
Token and
Token ID290
Feature activation+1.89e-02

Pos logit contributions
ad+0.218
ern+0.215
alia+0.202
eda+0.200
artments+0.200
Neg logit contributions
 little-0.087
 helicopter-0.083
 divorce-0.081
 baby-0.080
 sudden-0.080
Token lived
Token ID5615
Feature activation-4.91e-02

Pos logit contributions
apy+0.018
 little+0.018
 they+0.017
 much+0.016
 million+0.015
Neg logit contributions
ales-0.040
ad-0.040
ern-0.040
iles-0.039
 habitats-0.039
Token in
Token ID287
Feature activation-2.81e-02

Pos logit contributions
ad+0.086
ern+0.086
alia+0.080
 meanings+0.078
iles+0.075
Neg logit contributions
 divorce-0.067
 million-0.060
 sudden-0.060
burn-0.059
 helicopter-0.057
Token grown
Token ID7334
Feature activation+3.82e-02

Pos logit contributions
ad+0.204
ern+0.202
alia+0.190
 meanings+0.184
iles+0.178
Neg logit contributions
 divorce-0.132
 million-0.117
 sudden-0.116
burn-0.113
 helicopter-0.111
Token up
Token ID510
Feature activation+4.32e-02

Pos logit contributions
 divorce+0.055
 sudden+0.049
 million+0.048
apy+0.047
burn+0.047
Neg logit contributions
ad-0.065
ern-0.064
alia-0.062
 meanings-0.057
et-0.057
Token until
Token ID1566
Feature activation-2.07e-02

Pos logit contributions
 divorce+0.052
 million+0.047
 sudden+0.046
 late+0.045
 helicopter+0.044
Neg logit contributions
ern-0.083
ad-0.083
alia-0.079
 meanings-0.076
io-0.073
Token you
Token ID345
Feature activation-3.11e-02

Pos logit contributions
ad+0.040
ern+0.039
alia+0.037
 meanings+0.036
iles+0.035
Neg logit contributions
 divorce-0.025
 sudden-0.022
 million-0.022
burn-0.021
 helicopter-0.021
Token are
Token ID389
Feature activation-7.26e-02

Pos logit contributions
ern+0.052
ad+0.051
alia+0.049
 meanings+0.047
 facts+0.045
Neg logit contributions
 divorce-0.044
 million-0.040
burn-0.039
 sudden-0.039
 helicopter-0.038
Token seven
Token ID3598
Feature activation-2.63e-01

Pos logit contributions
ad+0.133
ern+0.132
alia+0.126
 meanings+0.122
 habitats+0.118
Neg logit contributions
 divorce-0.094
 million-0.087
 sudden-0.083
 late-0.082
burn-0.080
Token,
Token ID11
Feature activation-5.97e-04

Pos logit contributions
ern+0.449
ad+0.434
alia+0.404
 meanings+0.401
artments+0.394
Neg logit contributions
 divorce-0.353
 sudden-0.346
 million-0.335
burn-0.320
 band-0.320
Token like
Token ID588
Feature activation-1.75e-01

Pos logit contributions
ad+0.001
ern+0.001
alia+0.001
 meanings+0.001
iles+0.001
Neg logit contributions
 divorce-0.001
 million-0.001
 sudden-0.001
burn-0.001
 helicopter-0.001
Token me
Token ID502
Feature activation-1.28e-01

Pos logit contributions
ern+0.331
ad+0.327
alia+0.314
 meanings+0.293
io+0.290
Neg logit contributions
 divorce-0.217
 million-0.197
 sudden-0.193
 helicopter-0.190
 late-0.184
Token."
Token ID526
Feature activation+2.32e-02

Pos logit contributions
ad+0.246
ern+0.245
alia+0.234
 meanings+0.231
eda+0.222
Neg logit contributions
 divorce-0.148
burn-0.138
 million-0.135
 late-0.135
 helicopter-0.128
Token But
Token ID887
Feature activation-4.04e-02

Pos logit contributions
 divorce+0.027
 million+0.024
 sudden+0.024
burn+0.023
 helicopter+0.022
Neg logit contributions
ad-0.045
ern-0.045
alia-0.043
 meanings-0.042
iles-0.040
Token he
Token ID339
Feature activation+1.05e-02

Pos logit contributions
 divorce+0.029
 sudden+0.027
 million+0.026
 late+0.026
 helicopter+0.025
Neg logit contributions
ern-0.040
ad-0.040
alia-0.037
 meanings-0.037
 habitats-0.035
Token discovered
Token ID5071
Feature activation-4.21e-02

Pos logit contributions
 sudden+0.013
 divorce+0.012
 million+0.011
 little+0.011
 sank+0.011
Neg logit contributions
ern-0.021
ad-0.020
alia-0.019
 meanings-0.019
 facts-0.019
Token that
Token ID326
Feature activation-5.94e-02

Pos logit contributions
ern+0.087
alia+0.085
ad+0.083
 meanings+0.080
 habitats+0.080
Neg logit contributions
 divorce-0.044
 helicopter-0.041
 sudden-0.040
 million-0.039
 little-0.037
Token it
Token ID340
Feature activation+7.23e-02

Pos logit contributions
ern+0.125
ad+0.124
alia+0.119
 meanings+0.115
iles+0.111
Neg logit contributions
 divorce-0.063
 million-0.055
 sudden-0.053
 helicopter-0.053
burn-0.051
Token was
Token ID373
Feature activation-2.60e-04

Pos logit contributions
 million+0.050
 divorce+0.048
 sudden+0.044
 little+0.042
 destroyed+0.042
Neg logit contributions
ern-0.174
 habitats-0.167
ad-0.166
 meanings-0.164
alia-0.162
Token seven
Token ID3598
Feature activation-2.61e-01

Pos logit contributions
ern+0.001
ad+0.001
alia+0.001
 meanings+0.001
ales+0.001
Neg logit contributions
 late-0.000
 divorce-0.000
 little-0.000
 million-0.000
 worse-0.000
Token little
Token ID1310
Feature activation+1.49e-01

Pos logit contributions
ern+0.429
alia+0.400
ad+0.396
artments+0.393
 Lucas+0.386
Neg logit contributions
 sudden-0.372
 little-0.369
 million-0.326
 divorce-0.325
 band-0.323
Token birds
Token ID10087
Feature activation-4.72e-02

Pos logit contributions
 divorce+0.146
 million+0.140
 sudden+0.140
 helicopter+0.137
 little+0.130
Neg logit contributions
ern-0.313
ad-0.307
alia-0.288
 meanings-0.281
eda-0.273
Token sc
Token ID629
Feature activation-4.00e-02

Pos logit contributions
ad+0.075
ern+0.074
alia+0.069
 meanings+0.066
iles+0.064
Neg logit contributions
 divorce-0.073
 million-0.066
 sudden-0.066
burn-0.064
 helicopter-0.063
Tokenatt
Token ID1078
Feature activation+1.43e-02

Pos logit contributions
ad+0.052
ern+0.052
 meanings+0.048
alia+0.048
 poses+0.045
Neg logit contributions
 divorce-0.069
burn-0.062
 million-0.061
 unst-0.060
 sudden-0.060
Tokentering
Token ID20212
Feature activation+1.38e-02

Pos logit contributions
 divorce+0.020
burn+0.019
 million+0.019
 sudden+0.019
broken+0.018
Neg logit contributions
ern-0.023
ad-0.022
alia-0.021
 meanings-0.021
artments-0.020
Token and
Token ID290
Feature activation+2.09e-02

Pos logit contributions
ern+0.163
ad+0.155
 meanings+0.153
alia+0.150
artments+0.145
Neg logit contributions
 divorce-0.057
 late-0.052
 million-0.050
burn-0.049
 little-0.049
Token Mary
Token ID5335
Feature activation-8.33e-02

Pos logit contributions
 divorce+0.024
 million+0.021
 late+0.020
 sudden+0.020
burn+0.020
Neg logit contributions
ern-0.041
ad-0.041
 meanings-0.038
alia-0.038
 habitats-0.037
Token.
Token ID13
Feature activation-2.12e-02

Pos logit contributions
ern+0.144
ad+0.142
alia+0.135
 meanings+0.131
 habitats+0.125
Neg logit contributions
 divorce-0.114
 sudden-0.106
 million-0.106
 helicopter-0.102
burn-0.102
Token Tom
Token ID4186
Feature activation-7.24e-02

Pos logit contributions
ad+0.046
ern+0.045
alia+0.044
eda+0.043
ani+0.043
Neg logit contributions
 late-0.016
 divorce-0.016
 million-0.015
 together-0.015
 they-0.014
Token was
Token ID373
Feature activation-1.07e-03

Pos logit contributions
ern+0.173
ad+0.160
 meanings+0.157
alia+0.155
wered+0.155
Neg logit contributions
 late-0.052
 they-0.052
 together-0.051
 divorce-0.050
 little-0.049
Token seven
Token ID3598
Feature activation-2.61e-01

Pos logit contributions
ad+0.002
ern+0.002
alia+0.002
 meanings+0.002
iles+0.002
Neg logit contributions
 divorce-0.001
 million-0.001
 sudden-0.001
burn-0.001
 helicopter-0.001
Token,
Token ID11
Feature activation-9.26e-03

Pos logit contributions
ern+0.549
ad+0.507
artments+0.501
alia+0.497
 Lucas+0.482
Neg logit contributions
 sudden-0.265
 little-0.258
 divorce-0.234
 million-0.233
 band-0.230
Token and
Token ID290
Feature activation+2.38e-02

Pos logit contributions
ad+0.016
ern+0.016
alia+0.015
 meanings+0.014
iles+0.014
Neg logit contributions
 divorce-0.013
 million-0.012
 sudden-0.012
burn-0.011
 helicopter-0.011
Token Mary
Token ID5335
Feature activation-8.21e-02

Pos logit contributions
 million+0.035
 they+0.034
 divorce+0.034
 late+0.034
 sudden+0.033
Neg logit contributions
ern-0.037
ad-0.036
 habitats-0.033
alia-0.032
 meanings-0.032
Token was
Token ID373
Feature activation-3.11e-04

Pos logit contributions
ern+0.184
alia+0.173
ad+0.173
 habitats+0.172
 meanings+0.171
Neg logit contributions
 together-0.068
 they-0.065
 million-0.062
 we-0.061
 divorce-0.059
Token three
Token ID1115
Feature activation-8.93e-02

Pos logit contributions
ad+0.001
ern+0.001
alia+0.001
 meanings+0.001
iles+0.001
Neg logit contributions
 divorce-0.000
 million-0.000
 sudden-0.000
burn-0.000
 helicopter-0.000
Token pirate
Token ID25868
Feature activation-4.27e-02

Pos logit contributions
ern+0.192
ad+0.192
alia+0.180
 meanings+0.174
iles+0.168
Neg logit contributions
 divorce-0.133
 million-0.120
 helicopter-0.119
 sudden-0.118
burn-0.114
Token.
Token ID13
Feature activation-2.22e-02

Pos logit contributions
ern+0.083
ad+0.081
alia+0.077
 meanings+0.076
iles+0.071
Neg logit contributions
 divorce-0.051
 helicopter-0.046
 sudden-0.046
 million-0.045
burn-0.045
Token He
Token ID679
Feature activation-7.25e-02

Pos logit contributions
ern+0.039
ad+0.037
alia+0.037
 meanings+0.035
iles+0.034
Neg logit contributions
 divorce-0.029
 late-0.028
 million-0.027
 helicopter-0.026
 little-0.025
Token sailed
Token ID33835
Feature activation+9.79e-02

Pos logit contributions
ern+0.130
ad+0.125
alia+0.121
 meanings+0.113
iles+0.110
Neg logit contributions
 divorce-0.096
 sudden-0.092
 million-0.088
 helicopter-0.087
 late-0.084
Token the
Token ID262
Feature activation-8.45e-03

Pos logit contributions
 late+0.103
 together+0.095
 little+0.094
 divorce+0.090
 sudden+0.090
Neg logit contributions
ern-0.218
ad-0.196
alia-0.196
 meanings-0.187
iley-0.182
Token seven
Token ID3598
Feature activation-2.61e-01

Pos logit contributions
ad+0.013
ern+0.012
alia+0.012
iles+0.011
 habitats+0.011
Neg logit contributions
 divorce-0.013
apy-0.012
burn-0.012
 million-0.012
 sudden-0.012
Token seas
Token ID21547
Feature activation-1.05e-01

Pos logit contributions
ern+0.420
ad+0.414
alia+0.388
 meanings+0.374
artments+0.351
Neg logit contributions
 divorce-0.389
 sudden-0.366
 million-0.360
 helicopter-0.351
burn-0.344
Token,
Token ID11
Feature activation-1.41e-02

Pos logit contributions
ern+0.227
ad+0.221
alia+0.216
 meanings+0.212
artments+0.199
Neg logit contributions
 divorce-0.093
 million-0.084
 sudden-0.081
 late-0.079
 helicopter-0.078
Token looking
Token ID2045
Feature activation-4.36e-02

Pos logit contributions
ad+0.025
ern+0.025
alia+0.024
 meanings+0.023
iles+0.022
Neg logit contributions
 divorce-0.019
 million-0.017
 sudden-0.017
burn-0.017
 helicopter-0.016
Token for
Token ID329
Feature activation-1.95e-01

Pos logit contributions
ad+0.049
ern+0.048
alia+0.043
 meanings+0.041
iles+0.038
Neg logit contributions
 divorce-0.088
 million-0.082
 sudden-0.082
burn-0.081
 helicopter-0.079
Token treasure
Token ID14068
Feature activation-8.15e-02

Pos logit contributions
ern+0.398
alia+0.365
ad+0.363
 meanings+0.345
eka+0.330
Neg logit contributions
 divorce-0.222
 sudden-0.215
 little-0.213
 band-0.199
 million-0.197
Token named
Token ID3706
Feature activation+3.89e-02

Pos logit contributions
 divorce+0.061
 million+0.055
 sudden+0.054
burn+0.052
 helicopter+0.051
Neg logit contributions
ad-0.087
ern-0.086
alia-0.081
 meanings-0.078
iles-0.075
Token Beth
Token ID19223
Feature activation+1.41e-02

Pos logit contributions
 divorce+0.046
 sudden+0.041
 million+0.041
 helicopter+0.040
burn+0.040
Neg logit contributions
ern-0.078
ad-0.075
alia-0.071
 meanings-0.069
artments-0.066
Token.
Token ID13
Feature activation-2.67e-02

Pos logit contributions
 divorce+0.023
 million+0.021
 sudden+0.021
burn+0.020
 helicopter+0.020
Neg logit contributions
ad-0.021
ern-0.021
alia-0.020
 meanings-0.019
et-0.018
Token She
Token ID1375
Feature activation-7.64e-02

Pos logit contributions
ad+0.039
ern+0.039
alia+0.036
 meanings+0.034
iles+0.033
Neg logit contributions
 divorce-0.044
 million-0.041
 sudden-0.041
burn-0.040
 helicopter-0.039
Token was
Token ID373
Feature activation-3.10e-03

Pos logit contributions
ern+0.135
ad+0.135
alia+0.127
 meanings+0.123
iles+0.120
Neg logit contributions
 divorce-0.102
 sudden-0.094
 million-0.093
 helicopter-0.089
burn-0.089
Token seven
Token ID3598
Feature activation-2.59e-01

Pos logit contributions
ad+0.005
ern+0.005
alia+0.005
 meanings+0.005
iles+0.004
Neg logit contributions
 divorce-0.004
 million-0.004
 sudden-0.004
burn-0.004
 helicopter-0.004
Token years
Token ID812
Feature activation-1.05e-01

Pos logit contributions
ern+0.411
ad+0.387
alia+0.373
artments+0.369
 Lucas+0.359
Neg logit contributions
 sudden-0.390
 little-0.379
 divorce-0.368
 million-0.360
 band-0.358
Token old
Token ID1468
Feature activation-9.69e-02

Pos logit contributions
ad+0.144
ern+0.141
alia+0.130
 meanings+0.123
iles+0.118
Neg logit contributions
 divorce-0.185
 million-0.170
 sudden-0.170
burn-0.169
 helicopter-0.161
Token.
Token ID13
Feature activation-2.68e-02

Pos logit contributions
ad+0.212
ern+0.208
alia+0.195
iles+0.194
eda+0.193
Neg logit contributions
 little-0.084
 divorce-0.083
 helicopter-0.082
 sudden-0.082
 band-0.081
Token She
Token ID1375
Feature activation-7.47e-02

Pos logit contributions
ad+0.043
ern+0.042
alia+0.041
iles+0.039
 meanings+0.039
Neg logit contributions
 divorce-0.039
 sudden-0.037
 million-0.037
burn-0.036
 helicopter-0.035
Token loved
Token ID6151
Feature activation+5.67e-04

Pos logit contributions
ales+0.165
ern+0.162
ad+0.158
iles+0.158
alia+0.154
Neg logit contributions
 sudden-0.061
 little-0.061
 accidentally-0.059
 million-0.058
 band-0.057
Token said
Token ID531
Feature activation-5.53e-02

Pos logit contributions
 sudden+0.063
 divorce+0.063
 million+0.059
 band+0.056
 sank+0.056
Neg logit contributions
ern-0.122
ad-0.120
iles-0.114
alia-0.113
 meanings-0.112
Token,
Token ID11
Feature activation-3.37e-03

Pos logit contributions
ad+0.103
ern+0.102
alia+0.096
 meanings+0.093
iles+0.090
Neg logit contributions
 divorce-0.070
 million-0.063
 sudden-0.062
burn-0.061
 helicopter-0.059
Token "
Token ID366
Feature activation+8.20e-03

Pos logit contributions
ad+0.006
ern+0.006
alia+0.006
 meanings+0.006
iles+0.005
Neg logit contributions
 divorce-0.004
 million-0.004
 sudden-0.004
burn-0.004
 helicopter-0.004
TokenI
Token ID40
Feature activation+7.77e-03

Pos logit contributions
 divorce+0.009
 sudden+0.008
 million+0.008
burn+0.008
 late+0.007
Neg logit contributions
ern-0.017
ad-0.016
alia-0.016
 meanings-0.015
iles-0.015
Token picked
Token ID6497
Feature activation-7.26e-03

Pos logit contributions
 divorce+0.010
 million+0.009
 sudden+0.009
burn+0.009
 late+0.008
Neg logit contributions
ad-0.014
ern-0.014
alia-0.013
 meanings-0.013
iles-0.013
Token seven
Token ID3598
Feature activation-2.58e-01

Pos logit contributions
ern+0.015
ad+0.015
alia+0.014
 meanings+0.014
 habitats+0.014
Neg logit contributions
 divorce-0.008
 million-0.007
 helicopter-0.007
 sudden-0.006
burn-0.006
Token flowers
Token ID12734
Feature activation-8.08e-02

Pos logit contributions
ern+0.436
ad+0.436
alia+0.418
 Lucas+0.392
 meanings+0.391
Neg logit contributions
 sudden-0.336
 million-0.328
 divorce-0.326
 little-0.325
 helicopter-0.320
Token and
Token ID290
Feature activation+3.37e-02

Pos logit contributions
 meanings+0.172
ern+0.169
ad+0.167
 habitats+0.161
alia+0.161
Neg logit contributions
 million-0.074
 we-0.072
 together-0.071
 divorce-0.071
burn-0.069
Token you
Token ID345
Feature activation-3.01e-02

Pos logit contributions
 sudden+0.042
 little+0.042
 divorce+0.042
 million+0.041
 we+0.041
Neg logit contributions
ern-0.059
ad-0.059
alia-0.054
iles-0.052
iddle-0.051
Token?"
Token ID1701
Feature activation-3.52e-02

Pos logit contributions
ern+0.058
ad+0.057
alia+0.055
 meanings+0.052
 facts+0.051
Neg logit contributions
 divorce-0.032
 million-0.032
 little-0.029
 band-0.029
 destroyed-0.029
Token The
Token ID383
Feature activation+1.58e-02

Pos logit contributions
ad+0.069
ern+0.068
alia+0.065
 meanings+0.064
iles+0.063
Neg logit contributions
 divorce-0.040
 sudden-0.036
 million-0.035
 helicopter-0.033
burn-0.033
Token one
Token ID530
Feature activation-6.90e-02

Pos logit contributions
ern+0.034
ad+0.034
alia+0.033
iles+0.032
 meanings+0.032
Neg logit contributions
 divorce-0.014
 million-0.013
 together-0.013
 helicopter-0.012
 sudden-0.012
Token by
Token ID416
Feature activation-3.15e-02

Pos logit contributions
ern+0.134
ad+0.132
alia+0.128
 meanings+0.126
 habitats+0.124
Neg logit contributions
 divorce-0.076
 million-0.072
 helicopter-0.071
 sudden-0.071
 little-0.069
Token one
Token ID530
Feature activation-6.91e-02

Pos logit contributions
 meanings+0.052
alia+0.050
iles+0.049
 habitats+0.048
 facts+0.047
Neg logit contributions
 sudden-0.042
 divorce-0.042
 little-0.042
 million-0.041
 late-0.040
Token and
Token ID290
Feature activation+2.73e-02

Pos logit contributions
ern+0.139
ad+0.134
alia+0.130
 meanings+0.128
 habitats+0.127
Neg logit contributions
 divorce-0.071
 million-0.071
 sudden-0.070
 little-0.069
 helicopter-0.067
Token counted
Token ID14789
Feature activation-1.18e-01

Pos logit contributions
 divorce+0.034
 sudden+0.031
 million+0.031
 late+0.029
 helicopter+0.029
Neg logit contributions
ad-0.051
ern-0.051
alia-0.048
 meanings-0.047
iles-0.046
Token seven
Token ID3598
Feature activation-2.57e-01

Pos logit contributions
ern+0.264
alia+0.254
ad+0.252
 meanings+0.249
 habitats+0.244
Neg logit contributions
 together-0.098
 helicopter-0.089
 they-0.086
 little-0.086
 sudden-0.084
Token in
Token ID287
Feature activation-1.86e-02

Pos logit contributions
ern+0.503
ad+0.482
alia+0.481
 habitats+0.454
ori+0.451
Neg logit contributions
 sudden-0.297
 little-0.275
 million-0.269
 helicopter-0.251
 band-0.244
Token all
Token ID477
Feature activation-7.21e-02

Pos logit contributions
ern+0.037
alia+0.035
iles+0.034
ales+0.034
 meanings+0.033
Neg logit contributions
 sudden-0.021
 million-0.021
 little-0.020
 late-0.019
 divorce-0.018
Token.
Token ID13
Feature activation-1.73e-02

Pos logit contributions
ern+0.153
ad+0.146
 meanings+0.144
alia+0.143
 habitats+0.140
Neg logit contributions
 million-0.068
 divorce-0.068
 sudden-0.063
 helicopter-0.062
 late-0.061
Token To
Token ID1675
Feature activation-1.26e-02

Pos logit contributions
ad+0.028
ern+0.027
alia+0.026
 meanings+0.025
iles+0.024
Neg logit contributions
 divorce-0.026
 million-0.024
 sudden-0.024
 late-0.023
 helicopter-0.023
Token his
Token ID465
Feature activation-3.03e-02

Pos logit contributions
ern+0.024
 meanings+0.023
alia+0.023
 habitats+0.023
eka+0.022
Neg logit contributions
 divorce-0.014
 million-0.014
apy-0.013
 little-0.013
 they-0.013
Token Joe
Token ID5689
Feature activation-8.08e-02

Pos logit contributions
ad+0.061
ern+0.061
alia+0.056
 meanings+0.054
iles+0.051
Neg logit contributions
 divorce-0.069
 million-0.064
 sudden-0.063
burn-0.062
 helicopter-0.061
Token.
Token ID13
Feature activation-2.60e-02

Pos logit contributions
ern+0.152
ad+0.147
 meanings+0.143
alia+0.141
artments+0.133
Neg logit contributions
 divorce-0.091
 million-0.086
burn-0.085
 little-0.084
 helicopter-0.084
Token He
Token ID679
Feature activation-6.98e-02

Pos logit contributions
ern+0.047
alia+0.043
iley+0.042
ani+0.040
icity+0.039
Neg logit contributions
 late-0.033
 little-0.030
 together-0.030
 sudden-0.029
 divorce-0.029
Token sailed
Token ID33835
Feature activation+1.01e-01

Pos logit contributions
ern+0.133
ad+0.126
alia+0.122
 meanings+0.117
ales+0.112
Neg logit contributions
 divorce-0.083
 sudden-0.079
 million-0.077
 helicopter-0.076
 band-0.075
Token the
Token ID262
Feature activation-5.47e-03

Pos logit contributions
 late+0.104
 little+0.095
 together+0.094
 sudden+0.094
 divorce+0.091
Neg logit contributions
ern-0.228
ad-0.206
alia-0.200
 meanings-0.198
iley-0.187
Token seven
Token ID3598
Feature activation-2.56e-01

Pos logit contributions
ad+0.008
ern+0.007
alia+0.007
iles+0.007
 habitats+0.007
Neg logit contributions
 divorce-0.009
apy-0.008
burn-0.008
 million-0.008
 sudden-0.008
Token seas
Token ID21547
Feature activation-1.01e-01

Pos logit contributions
ad+0.391
ern+0.388
alia+0.360
 meanings+0.347
iles+0.330
Neg logit contributions
 divorce-0.408
 million-0.375
 sudden-0.373
burn-0.365
 helicopter-0.359
Token looking
Token ID2045
Feature activation-4.25e-02

Pos logit contributions
ern+0.213
ad+0.208
alia+0.202
 meanings+0.198
 habitats+0.186
Neg logit contributions
 divorce-0.097
 million-0.088
 sudden-0.085
 helicopter-0.083
 late-0.081
Token for
Token ID329
Feature activation-1.95e-01

Pos logit contributions
ad+0.059
ern+0.056
alia+0.054
 meanings+0.050
iles+0.050
Neg logit contributions
 divorce-0.074
 million-0.069
 sudden-0.068
burn-0.068
 helicopter-0.064
Token adventure
Token ID8855
Feature activation-8.41e-02

Pos logit contributions
ern+0.408
alia+0.367
ad+0.361
 meanings+0.355
eka+0.339
Neg logit contributions
 little-0.213
 sudden-0.208
 divorce-0.207
 band-0.189
 million-0.184
Token.
Token ID13
Feature activation-2.61e-02

Pos logit contributions
ern+0.202
 meanings+0.186
ad+0.186
alia+0.186
wered+0.181
Neg logit contributions
 together-0.072
 late-0.071
 while-0.061
 they-0.060
 little-0.054
Token,
Token ID11
Feature activation-3.21e-03

Pos logit contributions
ern+0.094
ad+0.089
 meanings+0.089
alia+0.087
 habitats+0.084
Neg logit contributions
 divorce-0.042
 helicopter-0.042
 sudden-0.042
 million-0.041
burn-0.040
Token four
Token ID1440
Feature activation-8.25e-02

Pos logit contributions
ern+0.006
ad+0.005
alia+0.005
 meanings+0.005
artments+0.005
Neg logit contributions
 sudden-0.004
 divorce-0.004
 million-0.004
 helicopter-0.004
 little-0.004
Token trees
Token ID7150
Feature activation-9.79e-02

Pos logit contributions
ern+0.163
ad+0.157
alia+0.153
artments+0.146
 meanings+0.145
Neg logit contributions
 sudden-0.101
 million-0.094
 little-0.089
 helicopter-0.085
 band-0.078
Token,
Token ID11
Feature activation-2.36e-03

Pos logit contributions
ern+0.227
 meanings+0.217
alia+0.214
ad+0.212
ori+0.209
Neg logit contributions
 helicopter-0.067
 together-0.067
 million-0.065
 sudden-0.062
 they-0.062
Token and
Token ID290
Feature activation+3.64e-02

Pos logit contributions
ern+0.005
ad+0.005
alia+0.005
 meanings+0.004
artments+0.004
Neg logit contributions
 sudden-0.002
 million-0.002
 divorce-0.002
 little-0.002
 helicopter-0.002
Token seven
Token ID3598
Feature activation-2.56e-01

Pos logit contributions
 sudden+0.044
 divorce+0.043
 million+0.042
 helicopter+0.040
 little+0.039
Neg logit contributions
ern-0.069
ad-0.067
alia-0.063
 meanings-0.061
artments-0.059
Token flowers
Token ID12734
Feature activation-8.00e-02

Pos logit contributions
ern+0.485
ad+0.440
alia+0.438
artments+0.434
ori+0.426
Neg logit contributions
 sudden-0.336
 little-0.316
 million-0.290
 helicopter-0.289
 band-0.269
Token.
Token ID13
Feature activation-4.65e-03

Pos logit contributions
ern+0.174
 meanings+0.169
alia+0.161
 habitats+0.160
eda+0.160
Neg logit contributions
 together-0.074
 they-0.070
 helicopter-0.064
 million-0.063
 little-0.057
Token␣
Token ID220
Feature activation+2.80e-02

Pos logit contributions
ern+0.009
ad+0.009
 meanings+0.009
alia+0.009
 facts+0.008
Neg logit contributions
 divorce-0.005
 sudden-0.005
 million-0.004
 late-0.004
apy-0.004
Token↵
Token ID198
Feature activation+3.24e-02

Pos logit contributions
 divorce+0.022
 million+0.019
 late+0.019
 sudden+0.018
 helicopter+0.016
Neg logit contributions
ern-0.064
ad-0.064
alia-0.060
 meanings-0.060
 habitats-0.058
Token↵
Token ID198
Feature activation+3.29e-02

Pos logit contributions
 divorce+0.026
 sudden+0.024
 million+0.024
 late+0.022
apy+0.019
Neg logit contributions
ern-0.073
ad-0.072
 meanings-0.070
alia-0.069
 habitats-0.067
Token lots
Token ID6041
Feature activation-1.94e-03

Pos logit contributions
ern+0.115
alia+0.106
ad+0.105
io+0.098
iles+0.096
Neg logit contributions
 divorce-0.068
 million-0.067
 sudden-0.065
 little-0.065
 helicopter-0.063
Token of
Token ID286
Feature activation+4.77e-02

Pos logit contributions
ern+0.005
ad+0.005
alia+0.004
 meanings+0.004
iles+0.004
Neg logit contributions
 divorce-0.001
 million-0.001
 sudden-0.001
 helicopter-0.001
 late-0.001
Token words
Token ID2456
Feature activation-1.31e-01

Pos logit contributions
 little+0.059
 million+0.059
 sudden+0.057
 divorce+0.057
 helicopter+0.056
Neg logit contributions
ern-0.094
alia-0.087
ad-0.084
io-0.083
eka-0.081
Token and
Token ID290
Feature activation+2.92e-02

Pos logit contributions
ern+0.258
ad+0.251
alia+0.244
eda+0.232
 meanings+0.229
Neg logit contributions
 divorce-0.150
 million-0.142
 helicopter-0.135
 late-0.135
 together-0.133
Token their
Token ID511
Feature activation+1.32e-02

Pos logit contributions
 divorce+0.041
 sudden+0.041
 million+0.039
 little+0.039
 helicopter+0.038
Neg logit contributions
ern-0.051
ad-0.047
alia-0.046
io-0.044
iles-0.043
Token meanings
Token ID26368
Feature activation-2.54e-01

Pos logit contributions
 little+0.019
 sudden+0.019
 divorce+0.019
 helicopter+0.018
 baby+0.018
Neg logit contributions
ern-0.024
ad-0.022
alia-0.022
iles-0.019
ales-0.019
Token".
Token ID1911
Feature activation+1.70e-02

Pos logit contributions
ern+0.490
alia+0.479
ad+0.472
eda+0.450
 habitats+0.444
Neg logit contributions
 divorce-0.280
 million-0.270
 helicopter-0.261
 together-0.256
 late-0.251
Token↵
Token ID198
Feature activation+2.72e-02

Pos logit contributions
 divorce+0.017
 sudden+0.016
 million+0.015
 late+0.015
 helicopter+0.014
Neg logit contributions
ad-0.035
ern-0.035
 meanings-0.033
alia-0.033
iles-0.032
Token↵
Token ID198
Feature activation+2.81e-02

Pos logit contributions
 divorce+0.020
 sudden+0.019
 million+0.018
 late+0.017
 band+0.015
Neg logit contributions
ern-0.062
ad-0.062
 meanings-0.060
alia-0.060
artments-0.058
TokenM
Token ID44
Feature activation-1.33e-02

Pos logit contributions
 divorce+0.033
 sudden+0.032
 million+0.031
 late+0.029
 helicopter+0.027
Neg logit contributions
ern-0.053
ad-0.052
 meanings-0.050
alia-0.050
 facts-0.048
Tokenolly
Token ID5098
Feature activation-7.77e-02

Pos logit contributions
ern+0.020
ad+0.019
artments+0.018
alia+0.018
et+0.018
Neg logit contributions
 divorce-0.020
 sudden-0.019
 million-0.018
 helicopter-0.018
 little-0.018
Token,
Token ID11
Feature activation-7.85e-03

Pos logit contributions
ern+0.167
ad+0.165
alia+0.155
 habitats+0.151
artments+0.150
Neg logit contributions
 sudden-0.088
 million-0.079
 little-0.076
 helicopter-0.072
 divorce-0.071
Token five
Token ID1936
Feature activation-1.30e-01

Pos logit contributions
ern+0.012
ad+0.012
 habitats+0.012
alia+0.012
 meanings+0.011
Neg logit contributions
 sudden-0.011
 million-0.011
 little-0.011
 helicopter-0.011
 band-0.010
Token,
Token ID11
Feature activation-4.91e-03

Pos logit contributions
ern+0.266
ad+0.263
alia+0.246
 habitats+0.241
artments+0.240
Neg logit contributions
 sudden-0.142
 million-0.132
 little-0.129
 helicopter-0.119
 divorce-0.117
Token six
Token ID2237
Feature activation-1.57e-01

Pos logit contributions
ern+0.007
ad+0.007
alia+0.007
 habitats+0.007
 meanings+0.007
Neg logit contributions
 sudden-0.007
 million-0.007
 helicopter-0.007
 little-0.007
 divorce-0.007
Token,
Token ID11
Feature activation-4.30e-03

Pos logit contributions
ern+0.325
ad+0.318
alia+0.298
artments+0.298
 habitats+0.297
Neg logit contributions
 sudden-0.172
 little-0.157
 million-0.152
 band-0.142
 helicopter-0.133
Token seven
Token ID3598
Feature activation-2.53e-01

Pos logit contributions
ad+0.004
ern+0.004
alia+0.004
 meanings+0.004
 habitats+0.004
Neg logit contributions
 million-0.009
 sudden-0.009
 divorce-0.009
 helicopter-0.009
 band-0.008
Token,
Token ID11
Feature activation-2.85e-03

Pos logit contributions
ern+0.532
ad+0.498
artments+0.489
 habitats+0.475
alia+0.475
Neg logit contributions
 sudden-0.258
 little-0.250
 million-0.232
 band-0.224
 helicopter-0.218
Token eight
Token ID3624
Feature activation-7.10e-02

Pos logit contributions
ad+0.004
ern+0.004
alia+0.004
 meanings+0.003
 habitats+0.003
Neg logit contributions
 divorce-0.005
 million-0.005
 sudden-0.005
 helicopter-0.004
burn-0.004
Token.
Token ID13
Feature activation-6.18e-03

Pos logit contributions
ern+0.155
ad+0.150
 habitats+0.145
artments+0.143
 meanings+0.142
Neg logit contributions
 sudden-0.065
 million-0.062
 little-0.060
 band-0.055
 helicopter-0.050
Token She
Token ID1375
Feature activation-6.33e-02

Pos logit contributions
ad+0.011
ern+0.011
alia+0.010
 meanings+0.010
iles+0.010
Neg logit contributions
 divorce-0.008
 million-0.008
 late-0.007
 sudden-0.007
apy-0.007
Token played
Token ID2826
Feature activation+4.61e-02

Pos logit contributions
ern+0.141
ales+0.137
ad+0.133
ocol+0.133
wered+0.132
Neg logit contributions
 million-0.049
 sudden-0.046
 little-0.046
 blew-0.045
 broke-0.043
Token,
Token ID11
Feature activation-2.50e-03

Pos logit contributions
ern+0.186
ad+0.183
 habitats+0.178
alia+0.176
iddle+0.173
Neg logit contributions
 sudden-0.062
 million-0.062
 little-0.059
 helicopter-0.049
 they-0.047
Token five
Token ID1936
Feature activation-1.28e-01

Pos logit contributions
 habitats+0.004
ern+0.004
ad+0.004
alia+0.004
dad+0.004
Neg logit contributions
 little-0.003
 sudden-0.003
 million-0.003
 band-0.003
 helicopter-0.003
Token,
Token ID11
Feature activation-4.68e-04

Pos logit contributions
ern+0.278
ad+0.275
alia+0.261
 habitats+0.261
 meanings+0.253
Neg logit contributions
 million-0.122
 sudden-0.117
 little-0.116
 helicopter-0.102
 band-0.097
Token six
Token ID2237
Feature activation-1.54e-01

Pos logit contributions
ern+0.001
ad+0.001
 habitats+0.001
alia+0.001
dad+0.001
Neg logit contributions
 little-0.001
 sudden-0.001
 million-0.001
 helicopter-0.001
 band-0.001
Token,
Token ID11
Feature activation-1.19e-04

Pos logit contributions
ern+0.342
ad+0.334
 habitats+0.325
alia+0.323
artments+0.317
Neg logit contributions
 million-0.133
 sudden-0.130
 little-0.130
 band-0.105
 helicopter-0.104
Token seven
Token ID3598
Feature activation-2.53e-01

Pos logit contributions
ad+0.000
ern+0.000
alia+0.000
 habitats+0.000
 meanings+0.000
Neg logit contributions
 million-0.000
 sudden-0.000
 helicopter-0.000
 little-0.000
 divorce-0.000
Token,
Token ID11
Feature activation+9.57e-04

Pos logit contributions
ern+0.590
 habitats+0.556
ad+0.551
artments+0.549
alia+0.546
Neg logit contributions
 little-0.190
 million-0.179
 sudden-0.168
 we-0.157
 helicopter-0.149
Token eight
Token ID3624
Feature activation-7.10e-02

Pos logit contributions
 million+0.002
 divorce+0.002
 sudden+0.002
 helicopter+0.001
 little+0.001
Neg logit contributions
ern-0.001
ad-0.001
alia-0.001
 meanings-0.001
 habitats-0.001
Token,
Token ID11
Feature activation+2.89e-03

Pos logit contributions
ern+0.161
ad+0.156
 habitats+0.154
 meanings+0.150
alia+0.150
Neg logit contributions
 million-0.058
 little-0.054
 sudden-0.053
 band-0.047
 we-0.046
Token nine
Token ID5193
Feature activation-3.62e-02

Pos logit contributions
 divorce+0.004
 million+0.004
 sudden+0.004
 late+0.004
 helicopter+0.004
Neg logit contributions
ern-0.004
ad-0.004
alia-0.004
 meanings-0.004
 habitats-0.004
Token,
Token ID11
Feature activation+5.53e-03

Pos logit contributions
ern+0.083
ad+0.080
alia+0.079
eda+0.078
ori+0.077
Neg logit contributions
 million-0.027
 little-0.026
 sudden-0.025
 we-0.022
 together-0.021
Token counted
Token ID14789
Feature activation-1.17e-01

Pos logit contributions
ern+0.124
ad+0.120
alia+0.114
 habitats+0.112
ocol+0.111
Neg logit contributions
 sudden-0.067
 million-0.062
 little-0.062
 sank-0.058
 together-0.058
Token five
Token ID1936
Feature activation-1.33e-01

Pos logit contributions
ern+0.260
iley+0.245
alia+0.244
ad+0.244
 habitats+0.242
Neg logit contributions
 together-0.109
 little-0.100
 they-0.095
 baby-0.086
 sudden-0.085
Token,
Token ID11
Feature activation-6.78e-03

Pos logit contributions
ad+0.289
ern+0.283
 habitats+0.266
 meanings+0.260
alia+0.259
Neg logit contributions
 sudden-0.134
 million-0.131
 little-0.130
 together-0.103
 while-0.101
Token six
Token ID2237
Feature activation-1.58e-01

Pos logit contributions
ern+0.011
ad+0.011
 habitats+0.010
 meanings+0.010
alia+0.010
Neg logit contributions
 sudden-0.010
 little-0.010
 million-0.009
 together-0.009
 late-0.008
Token,
Token ID11
Feature activation-5.97e-03

Pos logit contributions
ad+0.343
ern+0.342
 habitats+0.322
 meanings+0.320
alia+0.316
Neg logit contributions
 sudden-0.151
 little-0.145
 million-0.138
 while-0.108
 accidents-0.107
Token seven
Token ID3598
Feature activation-2.52e-01

Pos logit contributions
ad+0.005
ern+0.005
 meanings+0.004
alia+0.004
 habitats+0.004
Neg logit contributions
 million-0.013
 sudden-0.013
 little-0.012
 helicopter-0.012
 divorce-0.012
Token,
Token ID11
Feature activation-5.33e-03

Pos logit contributions
ern+0.571
ad+0.547
artments+0.529
alia+0.517
 meanings+0.517
Neg logit contributions
 little-0.217
 sudden-0.213
 million-0.187
 we-0.168
 accidents-0.155
Token eight
Token ID3624
Feature activation-7.17e-02

Pos logit contributions
ad+0.007
ern+0.007
alia+0.007
 meanings+0.007
 habitats+0.006
Neg logit contributions
 million-0.009
 divorce-0.009
 sudden-0.009
 helicopter-0.008
burn-0.008
Token.
Token ID13
Feature activation-1.40e-02

Pos logit contributions
ern+0.158
ad+0.155
 meanings+0.147
 habitats+0.146
alia+0.146
Neg logit contributions
 sudden-0.064
 million-0.061
 little-0.060
 band-0.049
 we-0.049
Token He
Token ID679
Feature activation-5.82e-02

Pos logit contributions
ad+0.026
ern+0.025
alia+0.024
 meanings+0.023
iles+0.022
Neg logit contributions
 divorce-0.018
 million-0.017
 sudden-0.016
 late-0.016
burn-0.015
Token even
Token ID772
Feature activation-4.61e-03

Pos logit contributions
ern+0.118
ad+0.114
ocol+0.112
 habitats+0.111
alia+0.108
Neg logit contributions
 sudden-0.057
 little-0.056
 million-0.055
 together-0.052
 sank-0.049
Token,
Token ID11
Feature activation-1.97e-03

Pos logit contributions
ern+0.179
ad+0.179
alia+0.171
 habitats+0.170
 meanings+0.168
Neg logit contributions
 million-0.069
 little-0.061
 sudden-0.061
 we-0.056
 helicopter-0.053
Token five
Token ID1936
Feature activation-1.29e-01

Pos logit contributions
ad+0.003
alia+0.003
ern+0.003
 habitats+0.003
dad+0.003
Neg logit contributions
 little-0.003
 million-0.003
 helicopter-0.003
 band-0.003
 sudden-0.003
Token,
Token ID11
Feature activation-1.00e-03

Pos logit contributions
ad+0.280
ern+0.278
alia+0.263
 meanings+0.260
 habitats+0.258
Neg logit contributions
 million-0.121
 little-0.112
 sudden-0.112
 helicopter-0.097
 we-0.096
Token six
Token ID2237
Feature activation-1.54e-01

Pos logit contributions
ad+0.001
ern+0.001
alia+0.001
 meanings+0.001
 habitats+0.001
Neg logit contributions
 little-0.002
 million-0.002
 helicopter-0.002
 band-0.001
 sudden-0.001
Token,
Token ID11
Feature activation-4.83e-04

Pos logit contributions
ad+0.350
ern+0.349
 meanings+0.333
alia+0.332
 habitats+0.331
Neg logit contributions
 million-0.119
 little-0.117
 sudden-0.111
 we-0.094
 together-0.085
Token seven
Token ID3598
Feature activation-2.52e-01

Pos logit contributions
ad+0.000
ern+0.000
alia+0.000
 meanings+0.000
 habitats+0.000
Neg logit contributions
 million-0.001
 little-0.001
 sudden-0.001
 helicopter-0.001
 band-0.001
Token,
Token ID11
Feature activation-2.93e-04

Pos logit contributions
ern+0.597
ad+0.570
artments+0.559
ipment+0.555
 habitats+0.555
Neg logit contributions
 little-0.173
 million-0.153
 sudden-0.148
 we-0.145
 helicopter-0.118
Token eight
Token ID3624
Feature activation-7.07e-02

Pos logit contributions
ad+0.000
ern+0.000
alia+0.000
 meanings+0.000
 habitats+0.000
Neg logit contributions
 million-0.000
 divorce-0.000
 sudden-0.000
 helicopter-0.000
 little-0.000
Token,
Token ID11
Feature activation+7.25e-05

Pos logit contributions
ern+0.164
ad+0.161
 meanings+0.154
 habitats+0.153
alia+0.153
Neg logit contributions
 million-0.053
 little-0.050
 sudden-0.048
 we-0.043
 helicopter-0.039
Token nine
Token ID5193
Feature activation-3.70e-02

Pos logit contributions
 divorce+0.000
 million+0.000
 sudden+0.000
 helicopter+0.000
burn+0.000
Neg logit contributions
ad-0.000
ern-0.000
 meanings-0.000
alia-0.000
 habitats-0.000
Token,
Token ID11
Feature activation+2.27e-03

Pos logit contributions
ern+0.086
ad+0.085
wered+0.082
alia+0.081
ori+0.079
Neg logit contributions
 million-0.026
 little-0.025
 sudden-0.022
 we-0.022
 together-0.019
Token Amber
Token ID21896
Feature activation-1.04e-01

Pos logit contributions
ern+0.003
ad+0.003
 meanings+0.003
alia+0.003
iles+0.003
Neg logit contributions
 divorce-0.001
 sudden-0.001
 million-0.001
 late-0.001
 helicopter-0.001
Token had
Token ID550
Feature activation-4.08e-02

Pos logit contributions
ern+0.209
ad+0.206
 meanings+0.197
alia+0.189
iles+0.187
Neg logit contributions
 divorce-0.112
 million-0.106
 sudden-0.105
 late-0.097
 band-0.096
Token a
Token ID257
Feature activation+4.15e-02

Pos logit contributions
ern+0.081
ad+0.078
iles+0.077
ani+0.076
 meanings+0.075
Neg logit contributions
 million-0.043
 divorce-0.043
 late-0.041
 sudden-0.041
 band-0.039
Token weight
Token ID3463
Feature activation-4.21e-02

Pos logit contributions
 divorce+0.051
 million+0.045
 sudden+0.044
burn+0.044
 helicopter+0.042
Neg logit contributions
ad-0.078
ern-0.078
alia-0.074
 meanings-0.072
iles-0.068
Token of
Token ID286
Feature activation+6.43e-02

Pos logit contributions
ern+0.086
 meanings+0.084
ad+0.084
 habitats+0.082
alia+0.079
Neg logit contributions
 divorce-0.043
 million-0.041
burn-0.040
 band-0.039
 late-0.039
Token seven
Token ID3598
Feature activation-2.52e-01

Pos logit contributions
 band+0.071
 baby+0.066
 little+0.065
 sudden+0.064
 better+0.064
Neg logit contributions
ani-0.126
alia-0.125
ern-0.123
eka-0.121
io-0.119
Token and
Token ID290
Feature activation+3.98e-02

Pos logit contributions
ern+0.463
ori+0.435
alia+0.435
ad+0.426
ipment+0.417
Neg logit contributions
 sudden-0.333
 little-0.306
 million-0.299
 band-0.278
 helicopter-0.278
Token Bella
Token ID39304
Feature activation-4.51e-02

Pos logit contributions
 divorce+0.044
 sudden+0.044
 million+0.043
 little+0.042
 late+0.041
Neg logit contributions
ad-0.076
ern-0.075
iles-0.071
 meanings-0.069
 habitats-0.069
Token had
Token ID550
Feature activation-4.24e-02

Pos logit contributions
ern+0.089
ad+0.087
iles+0.086
wered+0.086
 meanings+0.084
Neg logit contributions
 million-0.045
 sudden-0.043
 little-0.041
 divorce-0.040
 sank-0.040
Token a
Token ID257
Feature activation+4.20e-02

Pos logit contributions
ern+0.082
iles+0.079
ad+0.078
alia+0.078
ani+0.077
Neg logit contributions
 million-0.047
 sudden-0.043
 divorce-0.042
 band-0.041
 little-0.041
Token weight
Token ID3463
Feature activation-4.16e-02

Pos logit contributions
 divorce+0.050
 million+0.045
 sudden+0.044
burn+0.042
 helicopter+0.042
Neg logit contributions
ad-0.082
ern-0.081
alia-0.077
 meanings-0.074
iles-0.072
Token,
Token ID11
Feature activation-7.12e-03

Pos logit contributions
ern+0.192
ad+0.190
alia+0.183
ori+0.180
 habitats+0.179
Neg logit contributions
 sudden-0.064
 million-0.063
 little-0.061
 helicopter-0.054
 we-0.049
Token five
Token ID1936
Feature activation-1.31e-01

Pos logit contributions
ad+0.011
ern+0.011
 habitats+0.010
dad+0.010
alia+0.010
Neg logit contributions
 little-0.010
 sudden-0.010
 helicopter-0.010
 band-0.010
 million-0.010
Token,
Token ID11
Feature activation-5.87e-03

Pos logit contributions
ad+0.288
ern+0.287
alia+0.272
 habitats+0.264
ori+0.261
Neg logit contributions
 million-0.117
 sudden-0.114
 little-0.111
 helicopter-0.101
 divorce-0.096
Token six
Token ID2237
Feature activation-1.55e-01

Pos logit contributions
ern+0.009
ad+0.009
alia+0.008
dad+0.008
 habitats+0.008
Neg logit contributions
 little-0.008
 sudden-0.008
 helicopter-0.008
 together-0.008
 million-0.008
Token,
Token ID11
Feature activation-5.11e-03

Pos logit contributions
ern+0.355
ad+0.352
alia+0.340
 habitats+0.333
ori+0.332
Neg logit contributions
 sudden-0.117
 million-0.117
 little-0.115
 helicopter-0.094
 band-0.090
Token seven
Token ID3598
Feature activation-2.52e-01

Pos logit contributions
ad+0.005
ern+0.005
alia+0.004
 meanings+0.004
 habitats+0.004
Neg logit contributions
 million-0.011
 sudden-0.010
 helicopter-0.010
 little-0.010
 divorce-0.010
Token,
Token ID11
Feature activation-4.50e-03

Pos logit contributions
ern+0.602
ad+0.571
ori+0.567
alia+0.564
artments+0.564
Neg logit contributions
 little-0.166
 million-0.155
 sudden-0.153
 helicopter-0.134
 we-0.128
Token eight
Token ID3624
Feature activation-7.09e-02

Pos logit contributions
ad+0.006
ern+0.006
alia+0.006
 meanings+0.006
 habitats+0.006
Neg logit contributions
 million-0.007
 divorce-0.007
 sudden-0.007
 helicopter-0.007
 little-0.007
Token,
Token ID11
Feature activation-4.33e-03

Pos logit contributions
ern+0.168
ad+0.166
 habitats+0.160
ori+0.159
alia+0.159
Neg logit contributions
 million-0.047
 little-0.046
 sudden-0.044
 band-0.037
 helicopter-0.036
Token nine
Token ID5193
Feature activation-3.72e-02

Pos logit contributions
ad+0.006
ern+0.006
alia+0.005
 meanings+0.005
 habitats+0.005
Neg logit contributions
 divorce-0.007
 million-0.007
 sudden-0.007
 helicopter-0.007
burn-0.007
Token,
Token ID11
Feature activation-1.88e-03

Pos logit contributions
ern+0.088
ad+0.086
wered+0.084
ori+0.083
alia+0.083
Neg logit contributions
 little-0.024
 million-0.022
 sudden-0.022
 accidents-0.018
 we-0.018
Token the
Token ID262
Feature activation+1.65e-02

Pos logit contributions
 divorce+0.172
 sudden+0.156
burn+0.154
 million+0.152
 helicopter+0.148
Neg logit contributions
ad-0.165
ern-0.160
alia-0.145
 meanings-0.133
et-0.132
Token chair
Token ID5118
Feature activation+2.81e-02

Pos logit contributions
 divorce+0.024
 sudden+0.022
 million+0.022
burn+0.022
 helicopter+0.021
Neg logit contributions
ad-0.027
ern-0.027
alia-0.025
 meanings-0.024
iles-0.023
Token.
Token ID13
Feature activation+1.52e-03

Pos logit contributions
 divorce+0.039
 million+0.036
 sudden+0.035
burn+0.035
 helicopter+0.034
Neg logit contributions
ad-0.048
ern-0.048
alia-0.045
 meanings-0.044
iles-0.042
Token The
Token ID383
Feature activation+2.85e-02

Pos logit contributions
 divorce+0.003
 million+0.002
 sudden+0.002
burn+0.002
 helicopter+0.002
Neg logit contributions
ad-0.002
ern-0.002
alia-0.002
 meanings-0.002
iles-0.002
Token cup
Token ID6508
Feature activation+4.71e-03

Pos logit contributions
 divorce+0.032
 band+0.029
 sudden+0.029
 million+0.029
 helicopter+0.028
Neg logit contributions
ern-0.058
ad-0.057
alia-0.054
 meanings-0.053
 habitats-0.052
Token fell
Token ID3214
Feature activation+1.55e-01

Pos logit contributions
 divorce+0.005
 sudden+0.005
 million+0.004
burn+0.004
 helicopter+0.004
Neg logit contributions
ad-0.010
ern-0.009
alia-0.009
 meanings-0.008
et-0.008
Token off
Token ID572
Feature activation+1.08e-01

Pos logit contributions
 divorce+0.275
 sudden+0.248
 helicopter+0.205
 million+0.204
apy+0.202
Neg logit contributions
ad-0.231
ern-0.221
et-0.202
 herself-0.197
alia-0.192
Token the
Token ID262
Feature activation+1.72e-02

Pos logit contributions
 divorce+0.183
 sudden+0.166
burn+0.163
 million+0.158
 helicopter+0.155
Neg logit contributions
ad-0.160
ern-0.155
alia-0.140
et-0.128
 meanings-0.127
Token table
Token ID3084
Feature activation-3.07e-02

Pos logit contributions
 divorce+0.027
 sudden+0.025
 million+0.025
burn+0.024
 helicopter+0.024
Neg logit contributions
ad-0.026
ern-0.026
alia-0.024
 meanings-0.023
iles-0.022
Token.
Token ID13
Feature activation+9.81e-04

Pos logit contributions
ad+0.050
ern+0.049
alia+0.046
 meanings+0.044
iles+0.042
Neg logit contributions
 divorce-0.046
 million-0.042
 sudden-0.042
burn-0.041
 helicopter-0.040
Token The
Token ID383
Feature activation+2.78e-02

Pos logit contributions
 divorce+0.002
burn+0.002
 sudden+0.002
 million+0.002
 helicopter+0.002
Neg logit contributions
ern-0.001
ad-0.001
 meanings-0.001
alia-0.001
 poses-0.001
Token way
Token ID835
Feature activation+1.20e-01

Pos logit contributions
ad+0.091
ern+0.091
alia+0.085
 meanings+0.083
iles+0.079
Neg logit contributions
 divorce-0.067
 million-0.060
 sudden-0.060
burn-0.059
 helicopter-0.057
Token and
Token ID290
Feature activation+3.60e-02

Pos logit contributions
 divorce+0.147
 million+0.132
 sudden+0.131
burn+0.129
 helicopter+0.127
Neg logit contributions
ern-0.224
ad-0.222
alia-0.212
 meanings-0.209
iles-0.198
Token eventually
Token ID4191
Feature activation+5.42e-02

Pos logit contributions
 divorce+0.050
 sudden+0.046
 million+0.046
burn+0.044
 late+0.044
Neg logit contributions
ern-0.062
ad-0.062
alia-0.058
 meanings-0.056
iles-0.053
Token Joe
Token ID5689
Feature activation-7.00e-02

Pos logit contributions
 divorce+0.070
 million+0.063
 sudden+0.063
burn+0.062
 helicopter+0.060
Neg logit contributions
ad-0.100
ern-0.098
alia-0.093
 meanings-0.089
iles-0.086
Token was
Token ID373
Feature activation+1.50e-02

Pos logit contributions
ad+0.131
ern+0.127
alia+0.121
 meanings+0.115
et+0.113
Neg logit contributions
 divorce-0.089
 million-0.079
 sudden-0.078
burn-0.077
 helicopter-0.076
Token able
Token ID1498
Feature activation+1.25e-01

Pos logit contributions
 divorce+0.018
 sudden+0.016
 million+0.016
burn+0.015
 helicopter+0.015
Neg logit contributions
ad-0.029
ern-0.028
alia-0.026
et-0.025
 meanings-0.025
Token to
Token ID284
Feature activation-2.23e-02

Pos logit contributions
 divorce+0.164
 million+0.146
burn+0.144
 sudden+0.143
 helicopter+0.141
Neg logit contributions
ad-0.225
ern-0.218
alia-0.206
 meanings-0.201
iles-0.192
Token find
Token ID1064
Feature activation-7.29e-02

Pos logit contributions
ad+0.037
ern+0.037
alia+0.035
 meanings+0.034
iles+0.032
Neg logit contributions
 divorce-0.032
 million-0.029
 sudden-0.029
burn-0.029
 helicopter-0.028
Token his
Token ID465
Feature activation-1.58e-02

Pos logit contributions
ern+0.141
ad+0.135
alia+0.131
 meanings+0.127
 habitats+0.123
Neg logit contributions
 divorce-0.090
 sudden-0.080
 million-0.079
 helicopter-0.078
 late-0.076
Token way
Token ID835
Feature activation+1.19e-01

Pos logit contributions
ad+0.026
ern+0.025
 meanings+0.024
alia+0.023
et+0.023
Neg logit contributions
 divorce-0.022
 million-0.021
burn-0.020
 sudden-0.020
 late-0.019
Token home
Token ID1363
Feature activation-2.91e-02

Pos logit contributions
 divorce+0.157
 million+0.142
 sudden+0.140
burn+0.136
 helicopter+0.133
Neg logit contributions
ad-0.213
ern-0.210
alia-0.197
 meanings-0.188
iles-0.182
Token time
Token ID640
Feature activation+3.23e-02

Pos logit contributions
 divorce+0.034
 million+0.030
 sudden+0.030
burn+0.030
 helicopter+0.029
Neg logit contributions
ad-0.045
ern-0.045
alia-0.042
 meanings-0.040
iles-0.039
Token,
Token ID11
Feature activation-1.38e-02

Pos logit contributions
 divorce+0.042
 million+0.038
 sudden+0.038
burn+0.037
 helicopter+0.036
Neg logit contributions
ad-0.059
ern-0.058
alia-0.055
 meanings-0.053
iles-0.051
Token there
Token ID612
Feature activation+4.05e-02

Pos logit contributions
ad+0.024
ern+0.024
alia+0.022
iles+0.021
et+0.021
Neg logit contributions
 divorce-0.019
burn-0.017
 million-0.015
 sudden-0.015
 band-0.015
Token was
Token ID373
Feature activation-4.85e-03

Pos logit contributions
 divorce+0.037
 million+0.033
 sudden+0.033
 helicopter+0.030
apy+0.029
Neg logit contributions
ern-0.088
ad-0.086
alia-0.084
 meanings-0.083
artments-0.080
Token a
Token ID257
Feature activation+2.72e-02

Pos logit contributions
ad+0.007
ern+0.007
alia+0.006
 meanings+0.006
iles+0.005
Neg logit contributions
 divorce-0.008
 million-0.008
burn-0.007
 helicopter-0.007
 sank-0.007
Token little
Token ID1310
Feature activation+1.44e-01

Pos logit contributions
 divorce+0.030
burn+0.027
 million+0.025
 sank+0.024
 unst+0.024
Neg logit contributions
ad-0.049
alia-0.047
iley-0.046
ern-0.046
 seven-0.046
Token girl
Token ID2576
Feature activation+4.72e-02

Pos logit contributions
 divorce+0.171
 million+0.145
 sudden+0.143
 sank+0.136
burn+0.134
Neg logit contributions
ad-0.273
alia-0.263
ern-0.259
iley-0.248
iles-0.246
Token called
Token ID1444
Feature activation-4.27e-02

Pos logit contributions
 divorce+0.064
 million+0.058
 sudden+0.056
burn+0.055
 helicopter+0.054
Neg logit contributions
ad-0.085
ern-0.084
alia-0.078
 meanings-0.075
et-0.072
Token Amy
Token ID14235
Feature activation-1.30e-02

Pos logit contributions
ad+0.078
ern+0.078
alia+0.073
 meanings+0.071
iles+0.069
Neg logit contributions
 divorce-0.055
 million-0.049
 sudden-0.049
burn-0.048
 helicopter-0.046
Token who
Token ID508
Feature activation+1.19e-03

Pos logit contributions
ern+0.024
ad+0.024
alia+0.022
 meanings+0.022
iles+0.021
Neg logit contributions
 divorce-0.016
 million-0.015
burn-0.014
 sudden-0.014
 helicopter-0.014
Token had
Token ID550
Feature activation-6.08e-02

Pos logit contributions
 divorce+0.001
 late+0.001
 million+0.001
 helicopter+0.001
 band+0.001
Neg logit contributions
ern-0.002
ad-0.002
alia-0.002
iles-0.002
ales-0.002
Token share
Token ID2648
Feature activation-2.85e-02

Pos logit contributions
ad+0.045
ern+0.045
alia+0.042
 meanings+0.041
iles+0.040
Neg logit contributions
 divorce-0.039
 million-0.035
burn-0.034
 sudden-0.034
 helicopter-0.033
Token.
Token ID13
Feature activation-4.60e-03

Pos logit contributions
ern+0.057
ad+0.056
alia+0.055
iles+0.054
 meanings+0.052
Neg logit contributions
 divorce-0.032
 million-0.030
 late-0.029
 better-0.028
 together-0.028
Token Her
Token ID2332
Feature activation+7.00e-03

Pos logit contributions
ad+0.008
ern+0.008
alia+0.008
 meanings+0.007
iles+0.007
Neg logit contributions
 divorce-0.006
 million-0.006
 sudden-0.006
burn-0.005
 helicopter-0.005
Token friends
Token ID2460
Feature activation-1.32e-03

Pos logit contributions
 divorce+0.009
 million+0.008
 sudden+0.008
burn+0.007
 late+0.007
Neg logit contributions
ad-0.013
ern-0.013
alia-0.012
 meanings-0.012
 habitats-0.012
Token felt
Token ID2936
Feature activation-1.52e-02

Pos logit contributions
ad+0.002
ern+0.002
alia+0.002
 meanings+0.002
iles+0.002
Neg logit contributions
 divorce-0.002
 million-0.002
 sudden-0.002
burn-0.002
 helicopter-0.002
Token sad
Token ID6507
Feature activation+1.23e-01

Pos logit contributions
ad+0.035
ern+0.034
alia+0.033
 meanings+0.032
 facts+0.031
Neg logit contributions
 divorce-0.012
burn-0.011
 million-0.010
 sudden-0.010
 helicopter-0.009
Token and
Token ID290
Feature activation+3.54e-02

Pos logit contributions
 divorce+0.167
 million+0.151
 sudden+0.150
burn+0.146
 helicopter+0.143
Neg logit contributions
ad-0.219
ern-0.218
alia-0.204
 meanings-0.198
iles-0.190
Token left
Token ID1364
Feature activation-2.69e-02

Pos logit contributions
 divorce+0.039
 million+0.036
burn+0.035
 sudden+0.035
 helicopter+0.033
Neg logit contributions
ad-0.070
ern-0.069
alia-0.065
 meanings-0.064
 habitats-0.061
Token the
Token ID262
Feature activation+8.79e-03

Pos logit contributions
ad+0.045
ern+0.045
alia+0.042
 meanings+0.040
iles+0.039
Neg logit contributions
 divorce-0.039
 million-0.036
 sudden-0.035
burn-0.035
 helicopter-0.034
Token party
Token ID2151
Feature activation-2.13e-05

Pos logit contributions
apy+0.014
 divorce+0.014
burn+0.014
 million+0.013
 sudden+0.013
Neg logit contributions
ad-0.012
ern-0.011
alia-0.011
 meanings-0.010
 habitats-0.010
Token early
Token ID1903
Feature activation+6.51e-02

Pos logit contributions
ad+0.000
ern+0.000
alia+0.000
 meanings+0.000
iles+0.000
Neg logit contributions
 divorce-0.000
 million-0.000
 sudden-0.000
burn-0.000
 helicopter-0.000
Token time
Token ID640
Feature activation+3.47e-02

Pos logit contributions
 divorce+0.042
 million+0.038
burn+0.037
 helicopter+0.037
 sudden+0.037
Neg logit contributions
ad-0.047
ern-0.047
alia-0.043
 meanings-0.042
iles-0.040
Token,
Token ID11
Feature activation-1.35e-02

Pos logit contributions
 divorce+0.055
burn+0.050
 sudden+0.050
 million+0.049
 helicopter+0.047
Neg logit contributions
ad-0.055
ern-0.053
alia-0.049
iles-0.046
 meanings-0.045
Token there
Token ID612
Feature activation+4.25e-02

Pos logit contributions
ern+0.020
ad+0.020
 seven+0.018
alia+0.018
 Sof+0.018
Neg logit contributions
 divorce-0.021
burn-0.019
 million-0.017
 sudden-0.017
broken-0.017
Token was
Token ID373
Feature activation-2.99e-03

Pos logit contributions
 divorce+0.051
 million+0.045
 sudden+0.045
burn+0.044
 helicopter+0.043
Neg logit contributions
ad-0.082
ern-0.081
alia-0.077
 meanings-0.075
iles-0.072
Token a
Token ID257
Feature activation+2.97e-02

Pos logit contributions
ad+0.004
ern+0.003
alia+0.003
 herself+0.003
wered+0.003
Neg logit contributions
 divorce-0.006
 million-0.005
 sank-0.005
burn-0.005
broken-0.005
Token little
Token ID1310
Feature activation+1.46e-01

Pos logit contributions
 divorce+0.035
burn+0.031
 sank+0.028
 unst+0.028
 million+0.027
Neg logit contributions
ad-0.051
alia-0.049
 seven-0.049
iley-0.049
eda-0.048
Token boy
Token ID2933
Feature activation+4.25e-02

Pos logit contributions
 divorce+0.194
 sank+0.163
 sudden+0.163
 million+0.159
 unst+0.156
Neg logit contributions
ad-0.251
alia-0.244
ern-0.231
iley-0.230
iles-0.227
Token named
Token ID3706
Feature activation+4.28e-02

Pos logit contributions
 divorce+0.061
 million+0.055
 sudden+0.053
burn+0.052
 helicopter+0.051
Neg logit contributions
ad-0.075
ern-0.072
alia-0.067
 meanings-0.063
iles-0.063
Token Tim
Token ID5045
Feature activation+2.83e-02

Pos logit contributions
 divorce+0.062
 sudden+0.059
burn+0.057
 million+0.056
 helicopter+0.055
Neg logit contributions
ad-0.071
ern-0.069
alia-0.063
iles-0.062
 Lucas-0.061
Tokenmy
Token ID1820
Feature activation+9.96e-03

Pos logit contributions
 divorce+0.041
 million+0.038
 sudden+0.037
burn+0.037
 helicopter+0.036
Neg logit contributions
ern-0.046
ad-0.046
alia-0.043
 meanings-0.042
iles-0.040
Token.
Token ID13
Feature activation-2.14e-02

Pos logit contributions
 divorce+0.019
 million+0.017
 sudden+0.017
apy+0.016
burn+0.016
Neg logit contributions
ad-0.013
ern-0.012
alia-0.011
 meanings-0.011
et-0.011
Token a
Token ID257
Feature activation+3.94e-02

Pos logit contributions
ern+0.207
ad+0.205
alia+0.198
 meanings+0.186
iles+0.185
Neg logit contributions
 divorce-0.133
 million-0.119
 sudden-0.118
 helicopter-0.115
burn-0.109
Token magic
Token ID5536
Feature activation-3.12e-02

Pos logit contributions
 divorce+0.062
 late+0.060
 million+0.059
burn+0.059
 sudden+0.058
Neg logit contributions
ad-0.051
ern-0.051
 poses-0.049
alia-0.048
 Julia-0.046
Token trick
Token ID6908
Feature activation-6.47e-02

Pos logit contributions
ern+0.048
ad+0.047
alia+0.044
 meanings+0.043
 habitats+0.042
Neg logit contributions
 divorce-0.050
 helicopter-0.046
 million-0.045
 band-0.045
 sudden-0.044
Token.
Token ID13
Feature activation-1.56e-02

Pos logit contributions
ern+0.125
ad+0.121
 habitats+0.118
 meanings+0.118
alia+0.118
Neg logit contributions
 divorce-0.075
 band-0.067
burn-0.066
 helicopter-0.066
 together-0.066
Token But
Token ID887
Feature activation-4.51e-02

Pos logit contributions
ad+0.028
ern+0.028
alia+0.027
 meanings+0.025
iles+0.025
Neg logit contributions
 divorce-0.020
 million-0.018
 sudden-0.018
 helicopter-0.018
 late-0.018
Token it
Token ID340
Feature activation+8.08e-02

Pos logit contributions
ad+0.078
ern+0.076
alia+0.071
 meanings+0.069
iles+0.066
Neg logit contributions
 divorce-0.063
 million-0.058
 sudden-0.057
burn-0.057
 helicopter-0.054
Token didn
Token ID1422
Feature activation+8.05e-02

Pos logit contributions
 divorce+0.110
 million+0.098
 sudden+0.098
burn+0.096
 helicopter+0.093
Neg logit contributions
ad-0.144
ern-0.141
alia-0.133
 meanings-0.128
et-0.124
Token't
Token ID470
Feature activation+5.30e-02

Pos logit contributions
 divorce+0.115
 sudden+0.109
 million+0.103
burn+0.102
 helicopter+0.098
Neg logit contributions
ad-0.143
ern-0.137
alia-0.120
 meanings-0.115
wered-0.111
Token work
Token ID670
Feature activation-1.28e-02

Pos logit contributions
 divorce+0.083
 million+0.076
 sudden+0.074
burn+0.074
 helicopter+0.072
Neg logit contributions
ad-0.083
ern-0.083
alia-0.077
 meanings-0.074
 habitats-0.071
Token the
Token ID262
Feature activation+6.60e-03

Pos logit contributions
ad+0.021
ern+0.020
alia+0.019
 meanings+0.018
et+0.017
Neg logit contributions
 divorce-0.020
 sudden-0.018
 million-0.018
burn-0.017
 helicopter-0.017
Token first
Token ID717
Feature activation-7.99e-02

Pos logit contributions
 divorce+0.010
 million+0.009
 sudden+0.009
burn+0.009
 helicopter+0.009
Neg logit contributions
ad-0.010
ern-0.010
alia-0.010
 meanings-0.009
iles-0.009
Token hand
Token ID1021
Feature activation+2.39e-02

Pos logit contributions
ern+0.071
ad+0.070
alia+0.067
 meanings+0.064
iles+0.063
Neg logit contributions
 divorce-0.050
 sudden-0.046
 million-0.046
burn-0.044
 helicopter-0.044
Token too
Token ID1165
Feature activation-7.87e-02

Pos logit contributions
 divorce+0.027
 million+0.026
burn+0.026
 sudden+0.025
 late+0.024
Neg logit contributions
ern-0.046
ad-0.046
alia-0.044
 meanings-0.043
iles-0.042
Token tightly
Token ID17707
Feature activation+4.35e-02

Pos logit contributions
ad+0.141
ern+0.139
alia+0.131
 meanings+0.127
iles+0.122
Neg logit contributions
 divorce-0.106
 million-0.094
 sudden-0.094
burn-0.093
 helicopter-0.090
Token.
Token ID13
Feature activation+1.15e-04

Pos logit contributions
 divorce+0.050
 million+0.048
burn+0.048
 late+0.047
 sudden+0.047
Neg logit contributions
ad-0.082
ern-0.081
alia-0.079
 meanings-0.076
eda-0.075
Token She
Token ID1375
Feature activation-5.84e-02

Pos logit contributions
 divorce+0.000
 million+0.000
 sudden+0.000
burn+0.000
 late+0.000
Neg logit contributions
ad-0.000
ern-0.000
alia-0.000
 meanings-0.000
iles-0.000
Token tried
Token ID3088
Feature activation+9.08e-02

Pos logit contributions
ern+0.114
ad+0.114
alia+0.107
iles+0.106
 meanings+0.105
Neg logit contributions
 divorce-0.064
 sudden-0.062
 million-0.060
 band-0.056
 sank-0.054
Token to
Token ID284
Feature activation-1.73e-02

Pos logit contributions
 divorce+0.105
 sudden+0.092
 million+0.092
burn+0.090
 helicopter+0.086
Neg logit contributions
ad-0.180
ern-0.178
alia-0.169
 meanings-0.165
iles-0.161
Token take
Token ID1011
Feature activation+2.54e-02

Pos logit contributions
alia+0.030
ad+0.030
ern+0.030
 meanings+0.029
 habitats+0.028
Neg logit contributions
 divorce-0.022
burn-0.021
 sudden-0.021
 band-0.021
 little-0.020
Token it
Token ID340
Feature activation+8.59e-02

Pos logit contributions
 divorce+0.034
 million+0.030
 sudden+0.030
burn+0.030
 helicopter+0.029
Neg logit contributions
ad-0.046
ern-0.045
alia-0.042
 meanings-0.041
iles-0.040
Token off
Token ID572
Feature activation+9.90e-02

Pos logit contributions
 divorce+0.108
 sudden+0.098
 million+0.098
burn+0.092
 helicopter+0.092
Neg logit contributions
ad-0.160
ern-0.158
alia-0.147
 meanings-0.140
et-0.137
Token,
Token ID11
Feature activation+2.71e-03

Pos logit contributions
 divorce+0.115
 million+0.105
 sudden+0.104
 late+0.100
 helicopter+0.100
Neg logit contributions
ern-0.191
ad-0.190
alia-0.182
 meanings-0.179
iles-0.173
Token\n
Token ID198
Feature activation+2.49e-02

Pos logit contributions
 divorce+0.033
burn+0.029
 million+0.029
 helicopter+0.028
 sudden+0.028
Neg logit contributions
ad-0.046
ern-0.046
alia-0.043
 meanings-0.040
iles-0.039
TokenWhile
Token ID3633
Feature activation+4.09e-02

Pos logit contributions
 divorce+0.034
 helicopter+0.029
burn+0.029
 million+0.028
 sank+0.026
Neg logit contributions
ern-0.046
ad-0.044
alia-0.042
 meanings-0.041
ori-0.039
Token they
Token ID484
Feature activation+8.94e-02

Pos logit contributions
 divorce+0.060
burn+0.054
broken+0.053
 helicopter+0.052
 million+0.051
Neg logit contributions
ern-0.062
ad-0.060
alia-0.059
iley-0.056
 Julia-0.054
Token were
Token ID547
Feature activation-1.81e-02

Pos logit contributions
 divorce+0.155
 helicopter+0.140
burn+0.136
 late+0.133
 million+0.132
Neg logit contributions
ad-0.133
ern-0.130
alia-0.117
 meanings-0.109
et-0.101
Token cooking
Token ID10801
Feature activation+1.76e-02

Pos logit contributions
ern+0.028
ad+0.026
alia+0.025
artments+0.023
et+0.023
Neg logit contributions
 divorce-0.028
 million-0.026
 helicopter-0.025
 sudden-0.025
burn-0.023
Token,
Token ID11
Feature activation-7.83e-03

Pos logit contributions
 divorce+0.021
 million+0.019
 sudden+0.019
 helicopter+0.018
burn+0.018
Neg logit contributions
ern-0.034
ad-0.034
alia-0.032
 meanings-0.031
 habitats-0.030
Token Lily
Token ID20037
Feature activation+3.50e-02

Pos logit contributions
ern+0.012
ad+0.012
alia+0.011
 Lucas+0.011
 Julia+0.011
Neg logit contributions
 divorce-0.011
burn-0.010
 million-0.010
broken-0.010
 severe-0.010
Token saw
Token ID2497
Feature activation-2.95e-02

Pos logit contributions
 divorce+0.054
 helicopter+0.047
burn+0.046
 million+0.046
apy+0.045
Neg logit contributions
alia-0.054
ad-0.054
ern-0.052
 meanings-0.048
et-0.048
Token her
Token ID607
Feature activation-4.34e-02

Pos logit contributions
ad+0.047
ern+0.044
alia+0.041
 meanings+0.040
iles+0.038
Neg logit contributions
 divorce-0.045
 million-0.043
burn-0.042
 sudden-0.041
 helicopter-0.040
Token mom
Token ID1995
Feature activation+3.51e-02

Pos logit contributions
ad+0.080
ern+0.079
alia+0.075
 meanings+0.075
iles+0.070
Neg logit contributions
 divorce-0.053
 million-0.049
 sudden-0.045
burn-0.045
 helicopter-0.044
Tokenmy
Token ID1820
Feature activation+2.56e-02

Pos logit contributions
 divorce+0.063
 million+0.056
 helicopter+0.056
 sudden+0.055
apy+0.054
Neg logit contributions
ad-0.049
ern-0.047
alia-0.043
et-0.040
 meanings-0.039
Token was
Token ID373
Feature activation+1.33e-02

Pos logit contributions
 divorce+0.030
 million+0.027
 sudden+0.027
burn+0.026
 helicopter+0.026
Neg logit contributions
ern-0.048
ad-0.047
alia-0.045
 meanings-0.043
iles-0.041
Token coming
Token ID2406
Feature activation+1.18e-01

Pos logit contributions
 divorce+0.017
 sudden+0.016
 million+0.016
burn+0.015
 late+0.015
Neg logit contributions
ern-0.025
ad-0.024
alia-0.023
 meanings-0.022
iles-0.021
Token in
Token ID287
Feature activation-8.15e-03

Pos logit contributions
 divorce+0.159
 sudden+0.142
 million+0.141
burn+0.137
 helicopter+0.132
Neg logit contributions
ad-0.211
ern-0.204
alia-0.190
 meanings-0.184
et-0.178
Token.
Token ID13
Feature activation-2.70e-03

Pos logit contributions
ern+0.016
alia+0.016
 meanings+0.015
ad+0.015
iles+0.015
Neg logit contributions
 divorce-0.009
 late-0.009
 million-0.009
 sudden-0.008
 helicopter-0.008
Token<|endoftext|>
Token ID50256
Feature activation+1.25e-04

Pos logit contributions
ad+0.005
ern+0.005
alia+0.004
 meanings+0.004
iles+0.004
Neg logit contributions
 divorce-0.004
 million-0.003
 sudden-0.003
burn-0.003
 helicopter-0.003
TokenOnce
Token ID7454
Feature activation-1.90e-02

Pos logit contributions
 divorce+0.000
 million+0.000
 sudden+0.000
burn+0.000
 helicopter+0.000
Neg logit contributions
ad-0.000
ern-0.000
alia-0.000
 meanings-0.000
iles-0.000
Token upon
Token ID2402
Feature activation-8.02e-02

Pos logit contributions
ad+0.026
ern+0.026
alia+0.023
 meanings+0.023
iles+0.021
Neg logit contributions
 divorce-0.033
 million-0.031
 sudden-0.031
burn-0.030
 helicopter-0.030
Token a
Token ID257
Feature activation+2.50e-02

Pos logit contributions
ad+0.130
ern+0.124
alia+0.113
 meanings+0.110
 facts+0.105
Neg logit contributions
 divorce-0.124
 million-0.113
burn-0.112
 sudden-0.109
 helicopter-0.108
Token time
Token ID640
Feature activation+3.23e-02

Pos logit contributions
 divorce+0.030
 sudden+0.027
 million+0.027
burn+0.025
 late+0.025
Neg logit contributions
ad-0.048
ern-0.048
alia-0.045
 meanings-0.044
iles-0.042
Token,
Token ID11
Feature activation-1.25e-02

Pos logit contributions
 divorce+0.041
 million+0.037
 sudden+0.036
burn+0.035
 helicopter+0.035
Neg logit contributions
ad-0.060
ern-0.060
alia-0.056
 meanings-0.055
iles-0.052
Token there
Token ID612
Feature activation+4.09e-02

Pos logit contributions
ad+0.024
ern+0.024
alia+0.021
iles+0.021
 meanings+0.021
Neg logit contributions
 divorce-0.015
burn-0.014
 million-0.013
 sudden-0.012
 helicopter-0.012
Token got
Token ID1392
Feature activation+5.78e-02

Pos logit contributions
 divorce+0.059
 million+0.052
burn+0.051
 sudden+0.050
 helicopter+0.049
Neg logit contributions
ad-0.061
ern-0.061
alia-0.057
 meanings-0.055
artments-0.053
Token in
Token ID287
Feature activation-9.00e-03

Pos logit contributions
 divorce+0.082
 million+0.077
 sudden+0.073
burn+0.072
 helicopter+0.069
Neg logit contributions
ad-0.097
ern-0.094
alia-0.087
 meanings-0.085
artments-0.083
Token the
Token ID262
Feature activation+5.82e-03

Pos logit contributions
ern+0.016
ad+0.016
alia+0.016
 meanings+0.016
iles+0.016
Neg logit contributions
 divorce-0.011
 million-0.010
 sudden-0.010
 late-0.010
 band-0.010
Token box
Token ID3091
Feature activation-1.68e-02

Pos logit contributions
 divorce+0.007
burn+0.007
broken+0.006
 sudden+0.006
 million+0.006
Neg logit contributions
ad-0.010
ern-0.010
et-0.009
 X-0.009
 seven-0.009
Token.
Token ID13
Feature activation-1.02e-02

Pos logit contributions
ern+0.031
ad+0.031
alia+0.030
 meanings+0.030
iles+0.029
Neg logit contributions
 divorce-0.019
 late-0.018
 band-0.018
 million-0.017
burn-0.017
Token "
Token ID366
Feature activation+5.36e-03

Pos logit contributions
ern+0.016
ad+0.015
alia+0.014
 meanings+0.014
artments+0.013
Neg logit contributions
 divorce-0.016
burn-0.015
 million-0.015
 sudden-0.014
 helicopter-0.014
TokenI
Token ID40
Feature activation+1.33e-02

Pos logit contributions
 divorce+0.007
 million+0.006
 sudden+0.006
burn+0.006
 helicopter+0.006
Neg logit contributions
ad-0.010
ern-0.010
alia-0.009
 meanings-0.009
iles-0.008
Token'm
Token ID1101
Feature activation+1.75e-02

Pos logit contributions
 divorce+0.018
 million+0.016
 sudden+0.016
burn+0.016
 helicopter+0.015
Neg logit contributions
ad-0.024
ern-0.024
alia-0.022
 meanings-0.022
iles-0.021
Token a
Token ID257
Feature activation+4.09e-02

Pos logit contributions
 divorce+0.027
 sudden+0.024
 million+0.024
burn+0.023
 helicopter+0.023
Neg logit contributions
ad-0.028
ern-0.028
alia-0.026
 meanings-0.025
artments-0.024
Token famous
Token ID5863
Feature activation-5.79e-03

Pos logit contributions
 divorce+0.058
burn+0.054
apy+0.053
broken+0.052
 accidents+0.052
Neg logit contributions
 seven-0.058
ad-0.056
 count-0.055
 X-0.054
et-0.052
Token racer
Token ID48233
Feature activation-2.35e-02

Pos logit contributions
ad+0.011
ern+0.011
alia+0.010
 meanings+0.010
iles+0.010
Neg logit contributions
 divorce-0.007
 million-0.006
 sudden-0.006
burn-0.006
 late-0.006
Token beautiful
Token ID4950
Feature activation-5.54e-02

Pos logit contributions
ern+0.034
ad+0.033
alia+0.032
 meanings+0.031
 habitats+0.030
Neg logit contributions
 divorce-0.019
 sudden-0.017
 late-0.017
 million-0.017
apy-0.016
Token.
Token ID13
Feature activation-3.07e-03

Pos logit contributions
ern+0.111
ad+0.106
 meanings+0.104
alia+0.102
eda+0.099
Neg logit contributions
 late-0.059
 divorce-0.059
 together-0.058
 helicopter-0.056
 million-0.056
Token<|endoftext|>
Token ID50256
Feature activation-7.80e-04

Pos logit contributions
ad+0.005
ern+0.005
alia+0.005
 meanings+0.005
iles+0.005
Neg logit contributions
 divorce-0.004
 million-0.004
 sudden-0.004
 late-0.004
burn-0.004
TokenMom
Token ID29252
Feature activation-3.52e-03

Pos logit contributions
ern+0.001
ad+0.001
 meanings+0.001
alia+0.001
artments+0.001
Neg logit contributions
 divorce-0.001
 million-0.001
 sudden-0.001
 late-0.001
burn-0.001
Token and
Token ID290
Feature activation+2.48e-02

Pos logit contributions
ern+0.007
ad+0.007
 meanings+0.007
 facts+0.007
 poses+0.006
Neg logit contributions
 million-0.003
burn-0.003
 divorce-0.003
 little-0.003
 band-0.003
Token Dad
Token ID17415
Feature activation-5.24e-02

Pos logit contributions
 divorce+0.040
 million+0.037
 sudden+0.037
burn+0.036
 helicopter+0.034
Neg logit contributions
ad-0.038
ern-0.036
alia-0.034
 meanings-0.032
dad-0.031
Token took
Token ID1718
Feature activation+4.26e-02

Pos logit contributions
ern+0.104
ad+0.100
 meanings+0.095
alia+0.095
 habitats+0.092
Neg logit contributions
 divorce-0.059
 million-0.054
burn-0.052
 sudden-0.051
 helicopter-0.050
Token Jack
Token ID3619
Feature activation+7.93e-03

Pos logit contributions
 divorce+0.062
 million+0.056
 sudden+0.056
burn+0.054
 helicopter+0.053
Neg logit contributions
ad-0.071
ern-0.071
alia-0.066
 meanings-0.064
iles-0.061
Token sailing
Token ID28499
Feature activation+1.43e-02

Pos logit contributions
 divorce+0.008
burn+0.007
 million+0.007
 late+0.007
 sudden+0.007
Neg logit contributions
ad-0.016
ern-0.016
alia-0.015
 meanings-0.015
iles-0.015
Token.
Token ID13
Feature activation-1.69e-02

Pos logit contributions
 together+0.013
 helicopter+0.013
 band+0.012
 late+0.012
 divorce+0.011
Neg logit contributions
ern-0.032
 meanings-0.031
ad-0.031
alia-0.030
eda-0.029
Token It
Token ID632
Feature activation-3.79e-03

Pos logit contributions
ad+0.032
ern+0.031
alia+0.029
 meanings+0.029
iles+0.028
Neg logit contributions
 divorce-0.021
 million-0.019
 sudden-0.019
 late-0.018
burn-0.018
Token.
Token ID13
Feature activation-1.81e-02

Pos logit contributions
ern+0.354
ad+0.346
alia+0.335
 meanings+0.317
eda+0.309
Neg logit contributions
 divorce-0.263
 million-0.243
 helicopter-0.240
burn-0.231
 late-0.230
Token She
Token ID1375
Feature activation-7.03e-02

Pos logit contributions
ad+0.043
ern+0.042
alia+0.040
 meanings+0.039
et+0.038
Neg logit contributions
 divorce-0.012
 sudden-0.011
 little-0.011
 million-0.010
 late-0.010
Token also
Token ID635
Feature activation-4.75e-02

Pos logit contributions
ern+0.147
ad+0.143
 meanings+0.133
ocol+0.132
ales+0.131
Neg logit contributions
 divorce-0.071
 accidentally-0.065
 sank-0.064
 sudden-0.064
 little-0.063
Token liked
Token ID8288
Feature activation-7.99e-02

Pos logit contributions
ern+0.097
ad+0.092
iles+0.085
ales+0.084
alia+0.084
Neg logit contributions
 divorce-0.054
 late-0.050
 little-0.048
 million-0.047
 destroyed-0.047
Token to
Token ID284
Feature activation-3.52e-02

Pos logit contributions
ern+0.164
ad+0.162
alia+0.154
 meanings+0.148
io+0.142
Neg logit contributions
 divorce-0.087
 sudden-0.076
 million-0.075
 helicopter-0.073
burn-0.073
Token count
Token ID954
Feature activation-1.55e-01

Pos logit contributions
alia+0.076
ern+0.074
iley+0.073
io+0.070
ori+0.070
Neg logit contributions
 little-0.031
 band-0.030
 divorce-0.030
 helicopter-0.029
 fire-0.028
Token how
Token ID703
Feature activation-4.09e-02

Pos logit contributions
ern+0.345
alia+0.327
ad+0.320
dad+0.303
iley+0.302
Neg logit contributions
 helicopter-0.148
 together-0.139
 they-0.134
 little-0.134
 divorce-0.129
Token many
Token ID867
Feature activation-3.59e-02

Pos logit contributions
ad+0.080
ern+0.079
alia+0.075
 meanings+0.072
iles+0.070
Neg logit contributions
 divorce-0.048
 million-0.042
 sudden-0.042
burn-0.041
 helicopter-0.040
Token animals
Token ID4695
Feature activation-7.90e-02

Pos logit contributions
ern+0.066
ad+0.063
alia+0.061
artments+0.057
dad+0.056
Neg logit contributions
 sudden-0.044
 little-0.044
 divorce-0.044
 helicopter-0.044
 million-0.044
Token there
Token ID612
Feature activation+4.17e-02

Pos logit contributions
ern+0.158
ad+0.149
alia+0.146
 meanings+0.140
artments+0.135
Neg logit contributions
 divorce-0.086
 helicopter-0.080
 they-0.078
 together-0.078
 band-0.077
Token were
Token ID547
Feature activation-2.77e-02

Pos logit contributions
 they+0.028
 divorce+0.027
 we+0.026
 together+0.026
 band+0.024
Neg logit contributions
ern-0.100
alia-0.095
ad-0.093
artments-0.093
 meanings-0.091
Token,
Token ID11
Feature activation+1.01e-02

Pos logit contributions
ad+0.082
ern+0.081
alia+0.076
 meanings+0.073
io+0.070
Neg logit contributions
 divorce-0.065
 sudden-0.059
 million-0.059
burn-0.058
 helicopter-0.055
Token "
Token ID366
Feature activation+1.48e-02

Pos logit contributions
 divorce+0.014
 million+0.013
 sudden+0.013
burn+0.013
 helicopter+0.012
Neg logit contributions
ad-0.017
ern-0.017
alia-0.016
 meanings-0.016
iles-0.015
TokenThat
Token ID2504
Feature activation-5.49e-02

Pos logit contributions
 divorce+0.019
 sudden+0.018
 million+0.017
burn+0.017
 late+0.017
Neg logit contributions
ern-0.027
ad-0.027
alia-0.025
 meanings-0.025
iles-0.024
Token's
Token ID338
Feature activation+3.96e-02

Pos logit contributions
ad+0.106
ern+0.105
alia+0.100
 meanings+0.097
iles+0.093
Neg logit contributions
 divorce-0.066
 million-0.058
 sudden-0.058
burn-0.056
 helicopter-0.054
Token great
Token ID1049
Feature activation-6.42e-04

Pos logit contributions
 late+0.025
 divorce+0.024
 better+0.023
 million+0.021
 little+0.021
Neg logit contributions
ad-0.096
alia-0.096
ern-0.096
 habitats-0.093
io-0.092
Token!
Token ID0
Feature activation-3.42e-02

Pos logit contributions
ern+0.001
ad+0.001
alia+0.001
 meanings+0.001
 habitats+0.001
Neg logit contributions
 divorce-0.001
 million-0.001
 band-0.001
 sudden-0.001
burn-0.001
Token Be
Token ID1355
Feature activation-7.68e-04

Pos logit contributions
ad+0.060
ern+0.060
alia+0.056
 meanings+0.054
iles+0.051
Neg logit contributions
 divorce-0.047
 sudden-0.042
 million-0.042
burn-0.041
 helicopter-0.040
Token careful
Token ID8161
Feature activation-5.05e-03

Pos logit contributions
ern+0.002
ad+0.002
alia+0.001
 meanings+0.001
 habitats+0.001
Neg logit contributions
 divorce-0.001
 sudden-0.001
 million-0.001
 late-0.001
burn-0.001
Token next
Token ID1306
Feature activation-3.36e-02

Pos logit contributions
ern+0.009
ad+0.009
alia+0.009
 meanings+0.009
 habitats+0.008
Neg logit contributions
 divorce-0.006
burn-0.006
 million-0.006
 sudden-0.005
 late-0.005
Token time
Token ID640
Feature activation+4.64e-02

Pos logit contributions
ad+0.060
ern+0.059
alia+0.055
 meanings+0.053
iles+0.051
Neg logit contributions
 divorce-0.046
 million-0.041
 sudden-0.041
burn-0.040
 helicopter-0.040
Token."
Token ID526
Feature activation+2.63e-02

Pos logit contributions
 divorce+0.056
 million+0.051
 sudden+0.050
 late+0.050
burn+0.049
Neg logit contributions
ad-0.088
ern-0.088
alia-0.082
 meanings-0.080
iles-0.077
Token surprised
Token ID6655
Feature activation+5.83e-02

Pos logit contributions
ern+0.020
ad+0.020
alia+0.019
 meanings+0.018
 habitats+0.018
Neg logit contributions
 divorce-0.008
 sudden-0.008
 million-0.008
 late-0.008
 helicopter-0.007
Token and
Token ID290
Feature activation+3.70e-02

Pos logit contributions
 divorce+0.054
 sudden+0.051
 late+0.049
 million+0.048
 accidents+0.048
Neg logit contributions
ern-0.129
ad-0.120
alia-0.117
 meanings-0.116
artments-0.112
Token happy
Token ID3772
Feature activation+6.23e-02

Pos logit contributions
 divorce+0.037
 sudden+0.033
 million+0.033
 late+0.031
 helicopter+0.030
Neg logit contributions
ern-0.079
ad-0.079
alia-0.074
 meanings-0.072
iles-0.070
Token for
Token ID329
Feature activation-1.87e-01

Pos logit contributions
 divorce+0.070
 sudden+0.066
 million+0.064
 late+0.063
 together+0.062
Neg logit contributions
ern-0.127
ad-0.119
alia-0.118
eda-0.110
 meanings-0.110
Token the
Token ID262
Feature activation+1.27e-02

Pos logit contributions
ern+0.343
alia+0.336
iddle+0.318
 meanings+0.308
 habitats+0.306
Neg logit contributions
 together-0.232
 divorce-0.220
 they-0.217
 little-0.213
 sudden-0.210
Token turtle
Token ID28699
Feature activation-7.55e-02

Pos logit contributions
 divorce+0.016
 sudden+0.014
 million+0.014
 helicopter+0.014
burn+0.013
Neg logit contributions
ern-0.024
ad-0.024
alia-0.023
 meanings-0.022
iles-0.021
Token.
Token ID13
Feature activation-4.77e-03

Pos logit contributions
ern+0.144
ad+0.140
alia+0.134
 meanings+0.131
iles+0.129
Neg logit contributions
 divorce-0.088
 million-0.081
 helicopter-0.079
 sudden-0.077
 band-0.077
Token They
Token ID1119
Feature activation+1.31e-02

Pos logit contributions
ad+0.010
ern+0.010
alia+0.009
 meanings+0.009
iles+0.009
Neg logit contributions
 divorce-0.005
 million-0.005
 sudden-0.005
 late-0.004
 helicopter-0.004
Token cheered
Token ID34767
Feature activation+8.31e-02

Pos logit contributions
 divorce+0.014
 million+0.013
 sudden+0.013
 band+0.013
 helicopter+0.013
Neg logit contributions
ern-0.026
ad-0.025
alia-0.024
 meanings-0.024
iles-0.024
Token and
Token ID290
Feature activation+4.16e-02

Pos logit contributions
 divorce+0.102
 late+0.092
 million+0.091
 sudden+0.090
 helicopter+0.090
Neg logit contributions
ern-0.162
ad-0.154
alia-0.150
 meanings-0.142
iles-0.138
Token cl
Token ID537
Feature activation+3.16e-02

Pos logit contributions
 divorce+0.055
 million+0.049
 sudden+0.048
burn+0.047
 helicopter+0.046
Neg logit contributions
ad-0.076
ern-0.075
alia-0.071
 meanings-0.069
iles-0.065
Token support
Token ID1104
Feature activation+3.21e-02

Pos logit contributions
 divorce+0.058
 million+0.053
 sudden+0.053
burn+0.052
 helicopter+0.051
Neg logit contributions
ad-0.065
ern-0.064
alia-0.060
 meanings-0.058
iles-0.056
Token.
Token ID13
Feature activation+2.47e-03

Pos logit contributions
 divorce+0.035
 band+0.032
 late+0.032
 million+0.030
 together+0.030
Neg logit contributions
ern-0.063
iles-0.060
alia-0.059
ad-0.059
 meanings-0.059
Token She
Token ID1375
Feature activation-4.99e-02

Pos logit contributions
 divorce+0.004
 million+0.003
 sudden+0.003
burn+0.003
 helicopter+0.003
Neg logit contributions
ad-0.004
ern-0.004
alia-0.004
 meanings-0.004
iles-0.003
Token knew
Token ID2993
Feature activation+2.07e-02

Pos logit contributions
ad+0.092
ern+0.092
alia+0.087
 meanings+0.084
iles+0.082
Neg logit contributions
 divorce-0.063
 million-0.057
 sudden-0.056
burn-0.054
 helicopter-0.053
Token that
Token ID326
Feature activation-4.84e-02

Pos logit contributions
 divorce+0.024
 helicopter+0.021
 late+0.021
 sudden+0.021
 band+0.020
Neg logit contributions
ern-0.040
ad-0.039
alia-0.039
iles-0.037
 meanings-0.037
Token with
Token ID351
Feature activation-5.21e-02

Pos logit contributions
ern+0.091
ad+0.090
alia+0.086
 meanings+0.082
iles+0.081
Neg logit contributions
 divorce-0.061
 million-0.055
 sudden-0.054
burn-0.053
 late-0.052
Token her
Token ID607
Feature activation-4.03e-02

Pos logit contributions
ern+0.102
alia+0.096
ad+0.092
io+0.092
iles+0.090
Neg logit contributions
 divorce-0.063
 band-0.057
 late-0.056
 little-0.056
 sudden-0.055
Token Mom
Token ID11254
Feature activation+2.26e-02

Pos logit contributions
ad+0.072
ern+0.071
alia+0.067
 meanings+0.065
iles+0.062
Neg logit contributions
 divorce-0.054
 million-0.049
 sudden-0.049
burn-0.047
 helicopter-0.047
Token's
Token ID338
Feature activation+4.11e-02

Pos logit contributions
 divorce+0.024
 million+0.022
burn+0.021
 helicopter+0.021
 late+0.020
Neg logit contributions
ern-0.047
ad-0.045
alia-0.043
iles-0.042
 meanings-0.042
Token support
Token ID1104
Feature activation+3.40e-02

Pos logit contributions
 divorce+0.061
 sudden+0.055
 million+0.055
 helicopter+0.053
burn+0.053
Neg logit contributions
ern-0.068
ad-0.067
alia-0.064
 meanings-0.060
iles-0.059
Token,
Token ID11
Feature activation+4.95e-03

Pos logit contributions
 divorce+0.038
 million+0.033
 band+0.032
 late+0.032
burn+0.032
Neg logit contributions
ern-0.069
ad-0.066
alia-0.064
 meanings-0.061
iles-0.061
Token The
Token ID383
Feature activation+3.04e-02

Pos logit contributions
 divorce+0.007
 sudden+0.007
burn+0.007
 million+0.007
 helicopter+0.007
Neg logit contributions
ad-0.007
ern-0.006
alia-0.006
 meanings-0.006
 Lucas-0.005
Token thread
Token ID4704
Feature activation+2.25e-02

Pos logit contributions
 divorce+0.034
 million+0.030
 sudden+0.030
burn+0.029
 helicopter+0.028
Neg logit contributions
ad-0.061
ern-0.060
alia-0.057
 meanings-0.055
iles-0.053
Token is
Token ID318
Feature activation-5.87e-02

Pos logit contributions
 divorce+0.031
 sudden+0.028
 million+0.027
burn+0.027
 late+0.025
Neg logit contributions
ad-0.040
ern-0.038
alia-0.036
 meanings-0.036
iles-0.035
Token too
Token ID1165
Feature activation-7.44e-02

Pos logit contributions
ad+0.118
ern+0.117
alia+0.111
 meanings+0.107
iles+0.104
Neg logit contributions
 divorce-0.066
 million-0.058
 sudden-0.058
burn-0.056
 helicopter-0.055
Token thin
Token ID7888
Feature activation-5.09e-02

Pos logit contributions
ad+0.109
ern+0.102
 meanings+0.100
iles+0.100
alia+0.099
Neg logit contributions
 divorce-0.118
 helicopter-0.103
 sudden-0.102
burn-0.102
 million-0.100
Token for
Token ID329
Feature activation-1.82e-01

Pos logit contributions
ern+0.094
ad+0.094
alia+0.089
 meanings+0.086
eda+0.084
Neg logit contributions
 divorce-0.063
 late-0.057
 million-0.057
burn-0.056
 helicopter-0.054
Token the
Token ID262
Feature activation+1.48e-02

Pos logit contributions
ad+0.316
ern+0.313
alia+0.292
 meanings+0.284
iles+0.273
Neg logit contributions
 divorce-0.253
 million-0.229
 sudden-0.227
burn-0.223
 helicopter-0.217
Token button
Token ID4936
Feature activation+6.02e-02

Pos logit contributions
 divorce+0.017
 million+0.015
 sudden+0.015
burn+0.015
 helicopter+0.014
Neg logit contributions
ad-0.029
ern-0.029
alia-0.027
 meanings-0.027
iles-0.026
Token.
Token ID13
Feature activation+4.16e-03

Pos logit contributions
 divorce+0.093
 million+0.085
 sudden+0.085
burn+0.083
 helicopter+0.082
Neg logit contributions
ad-0.094
ern-0.093
alia-0.087
 meanings-0.084
iles-0.080
Token The
Token ID383
Feature activation+2.98e-02

Pos logit contributions
 divorce+0.007
burn+0.007
 sudden+0.007
 million+0.006
 helicopter+0.006
Neg logit contributions
ad-0.006
ern-0.006
alia-0.005
 meanings-0.005
 Lucas-0.005
Token thread
Token ID4704
Feature activation+2.23e-02

Pos logit contributions
 divorce+0.035
 million+0.031
 sudden+0.031
burn+0.030
 late+0.029
Neg logit contributions
ad-0.058
ern-0.057
alia-0.053
 meanings-0.053
iles-0.050
Token Jane
Token ID12091
Feature activation-2.45e-02

Pos logit contributions
ad+0.032
ern+0.031
alia+0.030
iles+0.029
 meanings+0.028
Neg logit contributions
 divorce-0.023
 million-0.021
 sudden-0.021
 late-0.021
 little-0.019
Token was
Token ID373
Feature activation+4.68e-03

Pos logit contributions
ad+0.041
ern+0.040
alia+0.038
et+0.036
 meanings+0.036
Neg logit contributions
 divorce-0.036
 million-0.032
burn-0.032
 late-0.031
 sudden-0.031
Token so
Token ID523
Feature activation-2.33e-02

Pos logit contributions
 divorce+0.006
 million+0.005
 sudden+0.005
burn+0.005
 helicopter+0.005
Neg logit contributions
ad-0.009
ern-0.008
alia-0.008
 meanings-0.007
et-0.007
Token proud
Token ID6613
Feature activation-1.90e-04

Pos logit contributions
ad+0.039
ern+0.036
alia+0.035
eka+0.033
 Lucas+0.033
Neg logit contributions
 divorce-0.035
burn-0.031
 helicopter-0.030
 million-0.029
bul-0.029
Token of
Token ID286
Feature activation+5.76e-02

Pos logit contributions
ern+0.000
ad+0.000
alia+0.000
 meanings+0.000
artments+0.000
Neg logit contributions
 divorce-0.000
 sudden-0.000
 million-0.000
 late-0.000
 little-0.000
Token herself
Token ID5223
Feature activation-1.76e-01

Pos logit contributions
 divorce+0.075
 they+0.075
 accidents+0.073
 together+0.073
 we+0.070
Neg logit contributions
alia-0.099
ern-0.098
 meanings-0.092
 habitats-0.090
ad-0.088
Token for
Token ID329
Feature activation-1.91e-01

Pos logit contributions
ad+0.278
ern+0.275
alia+0.257
 meanings+0.249
iles+0.237
Neg logit contributions
 divorce-0.270
 million-0.248
 sudden-0.245
burn-0.241
 helicopter-0.239
Token being
Token ID852
Feature activation-5.75e-02

Pos logit contributions
ern+0.322
ad+0.320
alia+0.319
 meanings+0.300
iles+0.290
Neg logit contributions
 divorce-0.268
 sudden-0.251
 band-0.238
burn-0.237
 million-0.236
Token successful
Token ID4388
Feature activation+2.56e-02

Pos logit contributions
ad+0.102
ern+0.099
alia+0.094
et+0.088
 meanings+0.086
Neg logit contributions
 divorce-0.079
 million-0.071
 sudden-0.070
burn-0.068
 helicopter-0.067
Token in
Token ID287
Feature activation-1.48e-02

Pos logit contributions
 late+0.025
 divorce+0.025
 together+0.025
 little+0.023
 band+0.023
Neg logit contributions
ern-0.052
ad-0.051
wered-0.048
eda-0.047
alia-0.047
Token steering
Token ID19702
Feature activation-1.65e-02

Pos logit contributions
ern+0.031
alia+0.031
 meanings+0.030
ales+0.030
iles+0.030
Neg logit contributions
 late-0.013
 divorce-0.013
 band-0.012
 sudden-0.012
 million-0.012
Token.
Token ID13
Feature activation-2.98e-02

Pos logit contributions
 divorce+0.058
 sudden+0.053
 million+0.052
 helicopter+0.050
burn+0.050
Neg logit contributions
ad-0.074
ern-0.074
alia-0.069
 meanings-0.066
iles-0.064
Token Can
Token ID1680
Feature activation+1.38e-02

Pos logit contributions
ern+0.048
ad+0.047
alia+0.043
 meanings+0.043
artments+0.040
Neg logit contributions
 divorce-0.045
 sudden-0.042
 million-0.041
burn-0.040
 helicopter-0.039
Token you
Token ID345
Feature activation-3.65e-02

Pos logit contributions
 divorce+0.017
 million+0.015
 sudden+0.015
burn+0.015
 helicopter+0.015
Neg logit contributions
ad-0.026
ern-0.026
alia-0.024
 meanings-0.023
iles-0.022
Token do
Token ID466
Feature activation-1.18e-01

Pos logit contributions
ern+0.065
ad+0.064
alia+0.063
 meanings+0.062
 habitats+0.061
Neg logit contributions
 divorce-0.048
burn-0.042
 band-0.042
 sudden-0.042
 million-0.041
Token that
Token ID326
Feature activation-6.36e-02

Pos logit contributions
ad+0.233
ern+0.233
alia+0.230
 habitats+0.217
 meanings+0.216
Neg logit contributions
 divorce-0.136
 helicopter-0.120
 sudden-0.119
 band-0.119
 million-0.117
Token for
Token ID329
Feature activation-1.97e-01

Pos logit contributions
ad+0.069
et+0.062
ern+0.061
 count+0.058
alia+0.057
Neg logit contributions
 divorce-0.116
 sudden-0.114
 million-0.107
apy-0.106
 sank-0.106
Token me
Token ID502
Feature activation-1.34e-01

Pos logit contributions
ern+0.270
ad+0.270
alia+0.257
 meanings+0.243
iles+0.232
Neg logit contributions
 divorce-0.342
 sudden-0.321
 million-0.317
burn-0.311
 helicopter-0.305
Token?"
Token ID1701
Feature activation-4.56e-02

Pos logit contributions
ad+0.250
ern+0.248
alia+0.236
 meanings+0.233
iles+0.221
Neg logit contributions
 divorce-0.167
burn-0.148
 million-0.146
 sudden-0.145
 late-0.143
Token\n
Token ID198
Feature activation+1.86e-02

Pos logit contributions
ern+0.080
ad+0.078
alia+0.070
 meanings+0.069
 Sof+0.064
Neg logit contributions
 divorce-0.064
burn-0.059
 million-0.057
 helicopter-0.055
 sudden-0.054
Token\n
Token ID198
Feature activation+1.91e-02

Pos logit contributions
 divorce+0.028
burn+0.026
 million+0.024
 helicopter+0.024
 sank+0.024
Neg logit contributions
ern-0.030
ad-0.030
alia-0.026
 meanings-0.025
dad-0.024
TokenLu
Token ID25596
Feature activation+1.57e-02

Pos logit contributions
 divorce+0.027
burn+0.024
 million+0.023
 sudden+0.022
 helicopter+0.022
Neg logit contributions
ern-0.035
ad-0.033
alia-0.030
 meanings-0.030
artments-0.028
Token let
Token ID1309
Feature activation+7.48e-02

Pos logit contributions
ern+0.141
ad+0.139
alia+0.134
 meanings+0.128
 habitats+0.125
Neg logit contributions
 divorce-0.072
 million-0.063
 sudden-0.063
 helicopter-0.060
burn-0.059
Token's
Token ID338
Feature activation+2.86e-02

Pos logit contributions
 divorce+0.171
 sudden+0.163
 million+0.161
 late+0.158
apy+0.158
Neg logit contributions
ad-0.069
ern-0.055
et-0.051
alia-0.050
eda-0.044
Token find
Token ID1064
Feature activation-7.72e-02

Pos logit contributions
 divorce+0.045
 million+0.042
 sudden+0.041
burn+0.040
 late+0.039
Neg logit contributions
ad-0.045
ern-0.043
alia-0.040
 meanings-0.039
iles-0.037
Token something
Token ID1223
Feature activation-2.01e-02

Pos logit contributions
ad+0.129
ern+0.116
et+0.107
alia+0.105
 meanings+0.104
Neg logit contributions
 divorce-0.117
 million-0.108
burn-0.108
 sudden-0.105
broken-0.101
Token else
Token ID2073
Feature activation+1.44e-02

Pos logit contributions
ern+0.038
ad+0.038
alia+0.035
 meanings+0.035
iles+0.034
Neg logit contributions
 divorce-0.023
burn-0.021
 million-0.021
 sudden-0.021
 helicopter-0.020
Token for
Token ID329
Feature activation-1.93e-01

Pos logit contributions
 divorce+0.017
 million+0.015
 sudden+0.015
burn+0.014
 helicopter+0.014
Neg logit contributions
ad-0.028
ern-0.028
alia-0.026
 meanings-0.026
iles-0.025
Token me
Token ID502
Feature activation-1.30e-01

Pos logit contributions
ern+0.334
ad+0.331
alia+0.316
 meanings+0.304
 habitats+0.289
Neg logit contributions
 divorce-0.268
 sudden-0.244
 million-0.243
burn-0.234
 late-0.232
Token to
Token ID284
Feature activation-3.11e-02

Pos logit contributions
ad+0.248
ern+0.247
alia+0.232
 meanings+0.226
iles+0.218
Neg logit contributions
 divorce-0.157
 million-0.140
 sudden-0.139
burn-0.135
 helicopter-0.132
Token pinch
Token ID25394
Feature activation+1.34e-02

Pos logit contributions
ern+0.059
alia+0.057
ad+0.055
 meanings+0.054
io+0.053
Neg logit contributions
 divorce-0.039
 band-0.035
 sudden-0.034
 helicopter-0.033
 million-0.033
Token."
Token ID526
Feature activation+7.12e-03

Pos logit contributions
 divorce+0.016
 million+0.015
 sudden+0.015
 late+0.014
burn+0.014
Neg logit contributions
ern-0.026
ad-0.025
alia-0.024
 meanings-0.023
 habitats-0.022
Token\n
Token ID198
Feature activation+2.48e-02

Pos logit contributions
 divorce+0.010
 million+0.009
burn+0.009
 helicopter+0.009
 sudden+0.009
Neg logit contributions
ad-0.013
ern-0.013
alia-0.012
 meanings-0.011
io-0.011
Token.
Token ID13
Feature activation-7.53e-03

Pos logit contributions
ern+0.268
ad+0.265
alia+0.244
 meanings+0.243
 facts+0.230
Neg logit contributions
 divorce-0.195
 million-0.177
burn-0.177
 sudden-0.176
 helicopter-0.175
Token They
Token ID1119
Feature activation+1.44e-02

Pos logit contributions
ad+0.014
ern+0.014
alia+0.013
 meanings+0.013
et+0.013
Neg logit contributions
 divorce-0.009
 million-0.008
 sudden-0.008
 late-0.008
 helicopter-0.008
Token wanted
Token ID2227
Feature activation-6.03e-03

Pos logit contributions
 they+0.012
 destroyed+0.011
 divorce+0.011
 helicopter+0.011
 little+0.011
Neg logit contributions
ern-0.032
ad-0.029
alia-0.029
 meanings-0.028
artments-0.028
Token something
Token ID1223
Feature activation-1.63e-02

Pos logit contributions
ern+0.014
ad+0.014
alia+0.014
artments+0.013
 meanings+0.013
Neg logit contributions
 divorce-0.005
 helicopter-0.004
 late-0.004
 sudden-0.004
 band-0.004
Token simple
Token ID2829
Feature activation-5.81e-02

Pos logit contributions
ad+0.037
ern+0.036
alia+0.036
 habitats+0.035
 meanings+0.034
Neg logit contributions
 divorce-0.016
 better-0.014
 together-0.014
 they-0.013
 band-0.013
Token for
Token ID329
Feature activation-1.91e-01

Pos logit contributions
ern+0.127
ad+0.125
alia+0.123
eda+0.116
wered+0.116
Neg logit contributions
 together-0.053
 divorce-0.050
 they-0.046
 we-0.046
 better-0.045
Token their
Token ID511
Feature activation+2.37e-02

Pos logit contributions
ern+0.319
alia+0.318
ad+0.288
iddle+0.284
 meanings+0.284
Neg logit contributions
 divorce-0.258
 they-0.256
 together-0.250
 helicopter-0.248
 little-0.245
Token new
Token ID649
Feature activation+1.46e-02

Pos logit contributions
 divorce+0.032
 sudden+0.029
 helicopter+0.028
 million+0.027
 band+0.027
Neg logit contributions
ern-0.044
ad-0.043
alia-0.041
 meanings-0.037
io-0.037
Token pup
Token ID15552
Feature activation-6.27e-03

Pos logit contributions
 divorce+0.019
 helicopter+0.018
 sudden+0.017
 million+0.017
 band+0.017
Neg logit contributions
ern-0.027
ad-0.027
alia-0.025
 meanings-0.024
io-0.023
Token.
Token ID13
Feature activation-1.12e-02

Pos logit contributions
ern+0.012
ad+0.012
alia+0.011
 meanings+0.010
iles+0.010
Neg logit contributions
 divorce-0.008
 million-0.007
 helicopter-0.007
burn-0.007
 late-0.007
Token Dad
Token ID17415
Feature activation-4.74e-02

Pos logit contributions
ad+0.023
ern+0.023
alia+0.022
 meanings+0.021
et+0.021
Neg logit contributions
 divorce-0.012
 million-0.011
 sudden-0.011
 helicopter-0.010
 late-0.010