TOP ACTIVATIONS
MAX = 0.117

Jack and Emma ’ s mom gave them cel
, and he ’ d sne aked it into
, with Mary ’ s heart full of the
with lots of sand and palm trees . There was a
became shall ower and shall ower until the bear was able
all of Leo ’ s lion friends reached the
. With Joe ’ s help , Steve started
were in Tommy ’ s bedroom . Tommy was
green and blue land with palm trees and flowers . "
. As Daisy ’ s gaze passed by ,
, and Pete ’ s heart raced as he
and the boy ’ s cane looked very similar
Mom . You ’ re very powerful ! ”
chair and we ’ ll finish the rest of
for Strip es ’ family . Spot finally found
favourite . Toby ’ s cream won ! The
restless and couldn ’ t get to sleep .
. On Mother ’ s Day , they presented
He has black hair and black eyes . He 's very
, and I ’ ll show you around .

MIN ACTIVATIONS
MIN = -0.101

You are a very good writer . Your uncle will be
Larry was a good - nat ured bird and always looked
there lived a princess named Flor a . She had long
. You are a great writer ." his mom says .
. You are a good writer . But you can write
was a little girl named Flor a . Flor a was
! You are a great writer !" she says .
" You are a gifted writer , Lily . You write
. You are a wonderful writer . You are a wonderful
you 're such a good writer ! I wish I could
came home she asked if Flor a had washed her toys
was a little girl named Beat rice . She was so
! You 're a good writer ." Lily felt happy and
And you are a good writer . But next time ,
, I am a brave astronaut !" Ben said . He
Mum was very pleased when Flor a showed her the clean
, can I be an astronaut and fly to space someday
time of becoming a brave astronaut , flying through the stars
Rex would be a good lear ner and look er too
. It then lowered an astronaut who waved to Sarah !

INTERVAL 0.063 - 0.117
CONTAINS 0.125%

The bears used their strong arms to clear away the broken
. It was small and black and had a loud voice
is : Don ’ t stir things up if
church was big and had thick walls that kept it warm
the popcorn . They spread thick blankets on the floor and

INTERVAL 0.008 - 0.063
CONTAINS 31.715%

a very special gift . Let 's keep it safe .
asked , " Mom my , why aren 't the stars
Mom said it was time to have a snack . She
between them . Jane smiled , as it was the y
will score ." The little bird practiced every day , and

INTERVAL -0.046 - 0.008
CONTAINS 67.408%

. L ily and Tim my were happy
a whistle . It can go fast and slow . Do
. Then John never woke up again . <|endoftext|> Once upon
be the best , so he went on stage . He
After wards , Daddy hugged his little girl and

INTERVAL -0.101 - -0.046
CONTAINS 0.752%

. We can come and visit it every day ."
and Max was the h ider . Lily counted to ten
Jimmy . He loved to visit the stadium and watch the
the sky !" With her parents , Lucy travelled
surprise for you . Come with me ." The
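
The four INTERVAL blocks above appear to split the range from MIN (-0.101) to MAX (0.117) into four equal-width bins of about 0.0545 each, and the CONTAINS figures sum to 100%. The sketch below shows one way such shares could be computed from a list of per-token activations. The function name `interval_shares` and the four-bin default are assumptions made for illustration, not the dashboard's actual code.

```python
from typing import List, Tuple


def interval_shares(
    activations: List[float], n_bins: int = 4
) -> List[Tuple[float, float, float]]:
    """Split [min, max] into n_bins equal-width intervals.

    Returns (lower, upper, percent_of_values) per interval,
    highest interval first, mirroring the listing order above.
    Hypothetical helper; not taken from the dashboard itself.
    """
    lo, hi = min(activations), max(activations)
    width = (hi - lo) / n_bins
    counts = [0] * n_bins
    for a in activations:
        if width > 0:
            # Clamp so the maximum value lands in the top bin.
            idx = min(int((a - lo) / width), n_bins - 1)
        else:
            # All values identical: put everything in the first bin.
            idx = 0
        counts[idx] += 1
    total = len(activations)
    rows = []
    for i in reversed(range(n_bins)):
        lower = lo + i * width
        upper = lo + (i + 1) * width
        rows.append((lower, upper, 100.0 * counts[i] / total))
    return rows


# Toy activations (made up for the example) spanning the reported MIN and MAX.
toy = [-0.101, -0.05, 0.0, 0.0, 0.01, 0.117]
for lower, upper, pct in interval_shares(toy):
    print(f"INTERVAL {lower:.3f} - {upper:.3f}  CONTAINS {pct:.3f}%")
```

With the reported MIN and MAX, the computed bin edges fall at roughly -0.0465, 0.008 and 0.0625, which agrees with the listed interval boundaries to within rounding.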
Token Jack
Token ID3619
Feature activation-6.77e-03

Pos logit contributions
 unnecessary+0.033
apy+0.032
 happened+0.031
 risks+0.031
OO+0.030
Neg logit contributions
 next-0.033
 Claus-0.030
 Amen-0.030
Santa-0.029
 timer-0.029
Token and
Token ID290
Feature activation+1.19e-02

Pos logit contributions
 Amen+0.010
 next+0.009
ider+0.009
 timer+0.009
Cl+0.008
Neg logit contributions
op-0.009
itten-0.009
OO-0.009
 happened-0.009
 tal-0.009
Token Emma
Token ID18966
Feature activation+1.25e-02

Pos logit contributions
 disease+0.016
 Everything+0.016
 cob+0.016
 mattered+0.016
 happened+0.015
Neg logit contributions
 next-0.015
 timer-0.015
ider-0.014
Santa-0.014
 Claus-0.014
Tokenâ
Token ID22940
Feature activation+1.49e-02

Pos logit contributions
op+0.015
're+0.015
itten+0.015
OO+0.014
ops+0.014
Neg logit contributions
 next-0.021
 Amen-0.018
 timer-0.017
 Avery-0.017
oe-0.017
Token
Token ID26391
Feature activation+4.23e-02

Pos logit contributions
itten+0.020
 unnecessary+0.019
OO+0.019
 disease+0.018
op+0.018
Neg logit contributions
 next-0.024
 Avery-0.021
 timer-0.020
 Claus-0.020
 Anna-0.020
Token
Token ID8151
Feature activation+1.17e-01

Pos logit contributions
itten+0.053
 unnecessary+0.052
 disease+0.050
OO+0.049
 risks+0.048
Neg logit contributions
 next-0.073
 Avery-0.062
 Claus-0.062
 Anna-0.061
oe-0.060
Tokens
Token ID82
Feature activation+4.51e-02

Pos logit contributions
 unnecessary+0.200
 disease+0.191
 risks+0.191
itten+0.190
OO+0.186
Neg logit contributions
 next-0.153
oe-0.122
 Avery-0.120
 Anna-0.119
 Claus-0.114
Token mom
Token ID1995
Feature activation+1.85e-02

Pos logit contributions
 muscles+0.064
op+0.064
 fault+0.064
itten+0.062
 cob+0.062
Neg logit contributions
 next-0.064
 timer-0.057
Santa-0.057
Cl-0.057
May-0.056
Token gave
Token ID2921
Feature activation-3.07e-02

Pos logit contributions
op+0.024
itten+0.024
're+0.023
 weren+0.022
ilst+0.021
Neg logit contributions
 next-0.028
 Amen-0.028
Cl-0.027
 Avery-0.026
ider-0.026
Token them
Token ID606
Feature activation+2.48e-02

Pos logit contributions
 next+0.044
 timer+0.039
 Claus+0.038
Santa+0.038
 Avery+0.037
Neg logit contributions
 unnecessary-0.043
itten-0.043
op-0.041
OO-0.041
 disease-0.040
Token cel
Token ID18725
Feature activation+2.02e-02

Pos logit contributions
itten+0.031
 unnecessary+0.030
 fault+0.029
op+0.029
 risks+0.029
Neg logit contributions
 next-0.037
 timer-0.036
 Amen-0.036
 Claus-0.035
 Avery-0.034
Token,
Token ID11
Feature activation+8.15e-03

Pos logit contributions
itten+0.021
op+0.020
 tal+0.017
ilst+0.017
OO+0.017
Neg logit contributions
 Amen-0.026
 timer-0.024
 waiter-0.023
Santa-0.023
 menus-0.022
Token and
Token ID290
Feature activation+1.15e-02

Pos logit contributions
itten+0.011
 cob+0.011
 spike+0.011
 weren+0.010
 tal+0.010
Neg logit contributions
 timer-0.010
 next-0.010
 Amen-0.010
ider-0.010
oe-0.010
Token he
Token ID339
Feature activation-2.62e-03

Pos logit contributions
itten+0.015
 tal+0.014
 unnecessary+0.014
apy+0.014
 cob+0.014
Neg logit contributions
 next-0.017
 timer-0.016
 Claus-0.015
oe-0.015
Santa-0.014
Tokenâ
Token ID22940
Feature activation+1.17e-03

Pos logit contributions
 next+0.004
 timer+0.004
Santa+0.004
 Amen+0.004
Cl+0.004
Neg logit contributions
 weren-0.003
 happened-0.003
 pierced-0.002
 shouldn-0.002
op-0.002
Token
Token ID26391
Feature activation+3.06e-02

Pos logit contributions
itten+0.001
OO+0.001
age+0.001
inter+0.001
uch+0.001
Neg logit contributions
 next-0.002
 Avery-0.002
 waiter-0.002
 timer-0.002
 Anna-0.002
Token
Token ID8151
Feature activation+1.15e-01

Pos logit contributions
itten+0.037
op+0.037
OO+0.036
 fault+0.034
 unnecessary+0.034
Neg logit contributions
 next-0.053
 timer-0.047
 Avery-0.046
 waiter-0.044
 kindly-0.043
Tokend
Token ID67
Feature activation+7.70e-03

Pos logit contributions
itten+0.171
op+0.163
OO+0.163
 fault+0.158
 unnecessary+0.155
Neg logit contributions
 next-0.169
 Avery-0.146
 timer-0.143
 Claus-0.139
Santa-0.135
Token sne
Token ID10505
Feature activation+2.23e-02

Pos logit contributions
OO+0.010
itten+0.010
op+0.010
ending+0.009
OP+0.009
Neg logit contributions
 next-0.012
 Amen-0.011
 timer-0.011
 Claus-0.010
Santa-0.010
Tokenaked
Token ID4335
Feature activation+9.68e-03

Pos logit contributions
itten+0.025
 unnecessary+0.024
OO+0.024
 fault+0.023
 disease+0.023
Neg logit contributions
 next-0.041
 Avery-0.036
 timer-0.034
 Anna-0.034
 Claus-0.034
Token it
Token ID340
Feature activation+3.00e-02

Pos logit contributions
op+0.011
 risks+0.010
OO+0.010
 unnecessary+0.010
itten+0.010
Neg logit contributions
 Amen-0.015
 next-0.014
 Claus-0.014
Anna-0.014
 timer-0.014
Token into
Token ID656
Feature activation-3.53e-03

Pos logit contributions
itten+0.043
OO+0.043
op+0.042
 fault+0.042
age+0.041
Neg logit contributions
 Amen-0.038
iza-0.036
 Avery-0.036
 next-0.036
 waiter-0.036
Token,
Token ID11
Feature activation+2.89e-02

Pos logit contributions
itten+0.075
op+0.074
OO+0.069
 unnecessary+0.068
 risks+0.068
Neg logit contributions
 Amen-0.048
 next-0.046
 timer-0.044
 Claus-0.044
 Avery-0.041
Token with
Token ID351
Feature activation-2.73e-02

Pos logit contributions
itten+0.042
op+0.039
OO+0.039
 pierced+0.039
 cob+0.039
Neg logit contributions
 Amen-0.033
 waiter-0.033
 next-0.033
 menus-0.032
Anna-0.032
Token Mary
Token ID5335
Feature activation+1.91e-02

Pos logit contributions
 next+0.038
 timer+0.032
 Avery+0.032
 Claus+0.032
Santa+0.030
Neg logit contributions
itten-0.042
 unnecessary-0.041
 disease-0.040
 fault-0.039
OO-0.039
Tokenâ
Token ID22940
Feature activation+5.19e-03

Pos logit contributions
itten+0.028
op+0.027
 tal+0.027
 unnecessary+0.026
 fault+0.025
Neg logit contributions
 timer-0.026
 next-0.024
 Amen-0.024
 waiter-0.023
 Avery-0.023
Token
Token ID26391
Feature activation+2.85e-02

Pos logit contributions
itten+0.006
OO+0.006
age+0.006
inter+0.005
 unnecessary+0.005
Neg logit contributions
 next-0.009
 Avery-0.008
 waiter-0.008
 Claus-0.008
 timer-0.008
Token
Token ID8151
Feature activation+1.14e-01

Pos logit contributions
itten+0.024
OO+0.023
 unnecessary+0.022
op+0.022
 fault+0.021
Neg logit contributions
 next-0.060
 Avery-0.054
 timer-0.053
 Claus-0.052
 Anna-0.051
Tokens
Token ID82
Feature activation+3.68e-02

Pos logit contributions
itten+0.176
op+0.171
OO+0.163
 fault+0.162
ending+0.152
Neg logit contributions
 next-0.160
 timer-0.144
 Avery-0.143
 Claus-0.137
 lad-0.135
Token heart
Token ID2612
Feature activation+3.93e-03

Pos logit contributions
itten+0.059
 fault+0.057
op+0.056
 disease+0.055
 unnecessary+0.055
Neg logit contributions
 next-0.049
 timer-0.043
 Avery-0.043
 Claus-0.041
Santa-0.040
Token full
Token ID1336
Feature activation+2.00e-02

Pos logit contributions
itten+0.007
op+0.007
age+0.006
OO+0.006
 pierced+0.006
Neg logit contributions
 next-0.004
 Avery-0.004
 timer-0.004
 Amen-0.004
 waiter-0.004
Token of
Token ID286
Feature activation-3.54e-03

Pos logit contributions
itten+0.026
 tum+0.024
op+0.024
 disease+0.023
 tal+0.023
Neg logit contributions
 Claus-0.031
 Amen-0.029
 next-0.029
 Avery-0.028
 timer-0.028
Token the
Token ID262
Feature activation+1.21e-02

Pos logit contributions
 next+0.006
 Avery+0.005
 Claus+0.005
 timer+0.005
oe+0.005
Neg logit contributions
itten-0.005
 unnecessary-0.005
 disease-0.005
 fault-0.005
 risks-0.004
Token with
Token ID351
Feature activation-3.10e-02

Pos logit contributions
 Avery+0.058
 Amen+0.058
May+0.054
Cl+0.054
Santa+0.052
Neg logit contributions
op-0.069
itten-0.068
 cob-0.064
 spike-0.064
OO-0.064
Token lots
Token ID6041
Feature activation+3.67e-02

Pos logit contributions
 next+0.057
 Avery+0.051
 timer+0.051
 Claus+0.051
 Amen+0.049
Neg logit contributions
itten-0.030
 unnecessary-0.030
OO-0.029
 fault-0.029
 pierced-0.029
Token of
Token ID286
Feature activation+4.97e-03

Pos logit contributions
itten+0.039
op+0.038
 cob+0.036
 risks+0.035
 muscles+0.035
Neg logit contributions
 Amen-0.064
 Claus-0.064
 Avery-0.064
 next-0.063
 Maya-0.058
Token sand
Token ID6450
Feature activation+2.81e-02

Pos logit contributions
 risks+0.006
 unnecessary+0.006
 cob+0.006
itten+0.006
 muscles+0.006
Neg logit contributions
 next-0.008
 Claus-0.008
 Avery-0.007
 timer-0.007
 Amen-0.007
Token and
Token ID290
Feature activation+6.90e-03

Pos logit contributions
op+0.045
 fault+0.044
itten+0.044
ops+0.044
 muscles+0.042
Neg logit contributions
 Amen-0.038
 Avery-0.035
May-0.033
Elsa-0.032
Anna-0.032
Token palm
Token ID18057
Feature activation+1.12e-01

Pos logit contributions
itten+0.008
 unnecessary+0.008
 cob+0.008
 risks+0.008
OO+0.008
Neg logit contributions
 next-0.011
 Avery-0.010
 Claus-0.010
 timer-0.010
oe-0.010
Token trees
Token ID7150
Feature activation+3.80e-02

Pos logit contributions
itten+0.145
op+0.135
 spike+0.132
 pierced+0.130
 cob+0.130
Neg logit contributions
 next-0.174
 Avery-0.169
 Claus-0.160
 Amen-0.160
 timer-0.153
Token.
Token ID13
Feature activation+2.96e-03

Pos logit contributions
itten+0.064
op+0.063
 spike+0.062
 pierced+0.061
 fault+0.059
Neg logit contributions
 Avery-0.045
 Amen-0.039
Anna-0.038
 Maya-0.038
Elsa-0.038
Token There
Token ID1318
Feature activation+3.28e-02

Pos logit contributions
itten+0.004
 unnecessary+0.004
 disease+0.004
OO+0.004
op+0.004
Neg logit contributions
 next-0.005
 Avery-0.004
 Anna-0.004
 timer-0.004
 Claus-0.004
Token was
Token ID373
Feature activation+2.48e-02

Pos logit contributions
 unnecessary+0.062
 disease+0.057
OO+0.057
itten+0.056
 risks+0.056
Neg logit contributions
 next-0.046
 smiled-0.034
 Anna-0.033
 first-0.033
 smiling-0.033
Token a
Token ID257
Feature activation-4.24e-03

Pos logit contributions
itten+0.036
 unnecessary+0.035
OO+0.035
 fault+0.035
 disease+0.034
Neg logit contributions
 next-0.036
 Avery-0.031
 timer-0.030
 Claus-0.029
Santa-0.028
Token became
Token ID2627
Feature activation+1.67e-03

Pos logit contributions
itten+0.035
op+0.034
 pierced+0.034
 disease+0.033
OO+0.033
Neg logit contributions
 next-0.038
 Avery-0.036
May-0.034
 Maya-0.034
 Anna-0.033
Token shall
Token ID2236
Feature activation-3.87e-02

Pos logit contributions
 unnecessary+0.003
 tal+0.002
 risks+0.002
itten+0.002
 pierced+0.002
Neg logit contributions
 next-0.002
 Avery-0.002
 Claus-0.002
 timer-0.002
 Maya-0.002
Tokenower
Token ID789
Feature activation+9.22e-02

Pos logit contributions
 Avery+0.047
 next+0.046
 Maya+0.043
 Anna+0.043
May+0.043
Neg logit contributions
itten-0.061
 fault-0.058
OO-0.058
 pierced-0.058
 unnecessary-0.057
Token and
Token ID290
Feature activation+9.56e-03

Pos logit contributions
itten+0.148
op+0.144
 fault+0.142
ending+0.139
 unnecessary+0.138
Neg logit contributions
 Avery-0.107
Anna-0.103
 Amen-0.102
 next-0.102
iza-0.100
Token shall
Token ID2236
Feature activation-1.85e-02

Pos logit contributions
itten+0.017
 unnecessary+0.017
OO+0.016
 disease+0.016
 fault+0.016
Neg logit contributions
 next-0.011
 Avery-0.009
 timer-0.009
 Claus-0.008
 Anna-0.008
Tokenower
Token ID789
Feature activation+1.09e-01

Pos logit contributions
 next+0.026
 Avery+0.026
May+0.024
 timer+0.024
 Maya+0.024
Neg logit contributions
itten-0.026
 unnecessary-0.024
 fault-0.024
 pierced-0.024
op-0.024
Token until
Token ID1566
Feature activation+1.86e-02

Pos logit contributions
itten+0.168
 fault+0.165
op+0.164
 unnecessary+0.161
OO+0.160
Neg logit contributions
 Avery-0.132
 next-0.130
Anna-0.125
 Amen-0.124
May-0.124
Token the
Token ID262
Feature activation-1.44e-02

Pos logit contributions
 tal+0.015
 unnecessary+0.015
 spike+0.015
apy+0.015
 risks+0.015
Neg logit contributions
 Claus-0.035
 next-0.034
 waiter-0.034
 Avery-0.034
 menus-0.034
Token bear
Token ID6842
Feature activation-1.42e-02

Pos logit contributions
 Avery+0.025
Anna+0.023
Cl+0.023
 Maya+0.023
 Amber+0.023
Neg logit contributions
 spike-0.017
itten-0.016
 unnecessary-0.016
 disease-0.016
 risks-0.016
Token was
Token ID373
Feature activation+1.57e-02

Pos logit contributions
 Avery+0.024
 next+0.024
 Amen+0.023
 Maya+0.022
 Anna+0.022
Neg logit contributions
op-0.016
itten-0.015
age-0.015
 pierced-0.015
 tal-0.015
Token able
Token ID1498
Feature activation+2.34e-03

Pos logit contributions
itten+0.022
 unnecessary+0.021
 fault+0.021
 pierced+0.021
 tal+0.021
Neg logit contributions
 next-0.022
 Avery-0.021
 timer-0.019
 Claus-0.019
Anna-0.019
Token all
Token ID477
Feature activation+4.60e-02

Pos logit contributions
 unnecessary+0.007
 disease+0.007
 tal+0.007
itten+0.007
OO+0.007
Neg logit contributions
 next-0.006
oe-0.006
 timer-0.006
 Avery-0.005
Cl-0.005
Token of
Token ID286
Feature activation+2.32e-02

Pos logit contributions
itten+0.068
 unnecessary+0.065
 tal+0.065
 pierced+0.065
 cob+0.065
Neg logit contributions
 next-0.064
 timer-0.055
 Claus-0.054
 Avery-0.053
Santa-0.053
Token Leo
Token ID19632
Feature activation+1.29e-02

Pos logit contributions
 unnecessary+0.038
itten+0.038
OO+0.037
 disease+0.037
 risks+0.036
Neg logit contributions
 next-0.029
 Avery-0.024
 timer-0.023
 Claus-0.023
oe-0.022
Tokenâ
Token ID22940
Feature activation-2.62e-04

Pos logit contributions
op+0.015
 unnecessary+0.014
OO+0.014
itten+0.014
 risks+0.014
Neg logit contributions
 next-0.021
 timer-0.021
 Amen-0.020
Santa-0.019
oe-0.019
Token
Token ID26391
Feature activation+2.79e-02

Pos logit contributions
 next+0.000
 Avery+0.000
 Anna+0.000
 timer+0.000
 waiter+0.000
Neg logit contributions
itten-0.000
OO-0.000
 unnecessary-0.000
age-0.000
 risks-0.000
Token
Token ID8151
Feature activation+1.08e-01

Pos logit contributions
itten+0.031
 unnecessary+0.030
 disease+0.029
OO+0.029
 fault+0.028
Neg logit contributions
 next-0.052
 Avery-0.045
 Claus-0.044
 Anna-0.043
Santa-0.042
Tokens
Token ID82
Feature activation+3.94e-02

Pos logit contributions
itten+0.160
op+0.153
OO+0.152
 fault+0.148
 unnecessary+0.147
Neg logit contributions
 next-0.160
 Avery-0.139
 timer-0.134
 Claus-0.130
Santa-0.127
Token lion
Token ID18744
Feature activation-1.76e-03

Pos logit contributions
itten+0.055
 unnecessary+0.054
OO+0.053
op+0.052
 disease+0.052
Neg logit contributions
 next-0.060
 Avery-0.052
 timer-0.051
 Claus-0.049
 Anna-0.049
Token friends
Token ID2460
Feature activation-1.45e-03

Pos logit contributions
 next+0.002
 Avery+0.002
 timer+0.002
 Amen+0.002
Santa+0.002
Neg logit contributions
op-0.003
itten-0.002
 unnecessary-0.002
 fault-0.002
OO-0.002
Token reached
Token ID4251
Feature activation+3.90e-02

Pos logit contributions
 next+0.002
 timer+0.002
Santa+0.002
Anna+0.002
 Avery+0.002
Neg logit contributions
op-0.002
OO-0.002
itten-0.002
 weren-0.002
're-0.002
Token the
Token ID262
Feature activation-1.96e-02

Pos logit contributions
 unnecessary+0.047
 tal+0.047
 fault+0.046
 risks+0.044
OO+0.044
Neg logit contributions
 next-0.061
 timer-0.059
ider-0.057
 Claus-0.057
Cl-0.056
Token.
Token ID13
Feature activation+2.19e-04

Pos logit contributions
 Avery+0.012
 timer+0.011
 Elsa+0.011
 Amen+0.011
 Claus+0.011
Neg logit contributions
 perspective-0.016
age-0.016
op-0.015
 fault-0.015
itten-0.015
Token With
Token ID2080
Feature activation-4.82e-02

Pos logit contributions
OO+0.000
itten+0.000
 fault+0.000
op+0.000
 unnecessary+0.000
Neg logit contributions
 next-0.000
 timer-0.000
 Avery-0.000
 Claus-0.000
 bride-0.000
Token Joe
Token ID5689
Feature activation+3.56e-02

Pos logit contributions
 next+0.058
 Avery+0.045
 Anna+0.043
 smiling+0.040
 kindly+0.040
Neg logit contributions
itten-0.089
OO-0.087
 unnecessary-0.085
op-0.085
 disease-0.084
Tokenâ
Token ID22940
Feature activation+2.19e-03

Pos logit contributions
itten+0.046
op+0.045
 unnecessary+0.044
ending+0.041
 tal+0.041
Neg logit contributions
 next-0.055
 timer-0.054
 Avery-0.052
 Amen-0.052
Santa-0.050
Token
Token ID26391
Feature activation+2.86e-02

Pos logit contributions
itten+0.002
OO+0.002
age+0.002
 unnecessary+0.002
inter+0.002
Neg logit contributions
 next-0.004
 Avery-0.004
 timer-0.003
 Anna-0.003
 waiter-0.003
Token
Token ID8151
Feature activation+1.06e-01

Pos logit contributions
itten+0.031
 unnecessary+0.031
 disease+0.031
OO+0.030
 fault+0.029
Neg logit contributions
 next-0.053
 Avery-0.046
 Claus-0.045
oe-0.044
 Anna-0.044
Tokens
Token ID82
Feature activation+4.12e-02

Pos logit contributions
itten+0.158
op+0.151
OO+0.149
 unnecessary+0.147
 fault+0.145
Neg logit contributions
 next-0.157
 Avery-0.136
 timer-0.134
 Claus-0.128
 Anna-0.125
Token help
Token ID1037
Feature activation+6.23e-03

Pos logit contributions
itten+0.062
OO+0.061
 unnecessary+0.060
 disease+0.058
op+0.057
Neg logit contributions
 next-0.061
 Avery-0.051
 Anna-0.048
 Claus-0.047
 timer-0.047
Token,
Token ID11
Feature activation+2.10e-02

Pos logit contributions
itten+0.010
 unnecessary+0.010
OO+0.010
op+0.010
 disease+0.010
Neg logit contributions
 next-0.008
 Avery-0.007
 timer-0.006
 Claus-0.006
 Anna-0.006
Token Steve
Token ID6542
Feature activation+1.03e-02

Pos logit contributions
 unnecessary+0.028
itten+0.028
 risks+0.028
op+0.027
 muscles+0.027
Neg logit contributions
 next-0.029
 timer-0.028
 Claus-0.027
 Amen-0.027
 Avery-0.026
Token started
Token ID2067
Feature activation+1.27e-02

Pos logit contributions
op+0.013
itten+0.013
ched+0.012
age+0.012
 pierced+0.012
Neg logit contributions
 timer-0.016
 next-0.016
 Amen-0.015
 waiter-0.014
 Avery-0.014
Token were
Token ID547
Feature activation+2.83e-02

Pos logit contributions
 next+0.003
 timer+0.003
Cl+0.003
 Avery+0.003
ider+0.003
Neg logit contributions
op-0.003
OO-0.003
itten-0.003
're-0.003
 happened-0.002
Token in
Token ID287
Feature activation+2.46e-03

Pos logit contributions
op+0.027
itten+0.027
OO+0.027
 fault+0.026
 muscles+0.025
Neg logit contributions
 timer-0.047
 keeper-0.047
guard-0.047
 Claus-0.047
Anna-0.047
Token Tommy
Token ID19919
Feature activation+6.24e-04

Pos logit contributions
itten+0.004
 unnecessary+0.004
 muscles+0.004
 disease+0.004
 fault+0.004
Neg logit contributions
 next-0.003
 timer-0.003
 Claus-0.003
 Avery-0.003
maid-0.003
Tokenâ
Token ID22940
Feature activation-1.14e-02

Pos logit contributions
op+0.001
itten+0.001
OO+0.001
 muscles+0.001
 tal+0.001
Neg logit contributions
 next-0.001
Anna-0.001
 timer-0.001
 waiter-0.001
Cl-0.001
Token
Token ID26391
Feature activation+2.46e-02

Pos logit contributions
 next+0.019
maid+0.018
 waiter+0.018
 sailors+0.018
 mer+0.018
Neg logit contributions
itten-0.014
age-0.013
ilst-0.013
inter-0.012
OO-0.012
Token
Token ID8151
Feature activation+1.06e-01

Pos logit contributions
itten+0.048
 unnecessary+0.047
OO+0.047
op+0.046
 disease+0.046
Neg logit contributions
 next-0.025
 Avery-0.019
 timer-0.018
 Claus-0.017
 Anna-0.017
Tokens
Token ID82
Feature activation+3.42e-02

Pos logit contributions
itten+0.158
op+0.152
OO+0.149
 fault+0.146
 unnecessary+0.140
Neg logit contributions
 next-0.158
 Avery-0.138
 timer-0.135
 Claus-0.130
Santa-0.128
Token bedroom
Token ID14043
Feature activation-1.49e-02

Pos logit contributions
 muscles+0.047
 cob+0.047
 fault+0.047
itten+0.046
 tal+0.045
Neg logit contributions
 timer-0.044
 next-0.043
Santa-0.043
Cl-0.043
Anna-0.041
Token.
Token ID13
Feature activation+7.10e-03

Pos logit contributions
 Avery+0.015
 Amen+0.015
Anna+0.014
 next+0.014
Santa+0.014
Neg logit contributions
op-0.026
OO-0.026
itten-0.025
 fault-0.025
 muscles-0.024
Token Tommy
Token ID19919
Feature activation-1.38e-02

Pos logit contributions
itten+0.011
op+0.011
OO+0.010
 muscles+0.010
inter+0.010
Neg logit contributions
 next-0.010
 timer-0.009
 Avery-0.008
 Claus-0.008
maid-0.008
Token was
Token ID373
Feature activation+1.40e-02

Pos logit contributions
 next+0.023
 timer+0.021
 Avery+0.020
 menus+0.019
 waiter+0.019
Neg logit contributions
itten-0.017
op-0.017
OO-0.016
 pierced-0.015
 spike-0.015
Token green
Token ID4077
Feature activation+4.23e-02

Pos logit contributions
 next+0.030
 Avery+0.029
Cl+0.027
 Claire+0.027
oe+0.027
Neg logit contributions
 cob-0.025
 spike-0.025
itten-0.024
 disease-0.024
 hamm-0.023
Token and
Token ID290
Feature activation-2.81e-02

Pos logit contributions
OO+0.053
itten+0.052
 spike+0.052
 tal+0.049
 cob+0.049
Neg logit contributions
 next-0.065
 Avery-0.065
 Maya-0.061
Cl-0.061
May-0.059
Token blue
Token ID4171
Feature activation+3.56e-02

Pos logit contributions
 next+0.039
 Avery+0.037
oe+0.037
 timer+0.036
maid+0.036
Neg logit contributions
 pierced-0.035
eware-0.035
 cob-0.034
 tal-0.034
itten-0.034
Token land
Token ID1956
Feature activation-1.63e-02

Pos logit contributions
itten+0.047
OO+0.047
 tal+0.045
op+0.045
 spike+0.044
Neg logit contributions
 next-0.053
 Avery-0.051
 Maya-0.049
Cl-0.049
 Anna-0.048
Token with
Token ID351
Feature activation-2.97e-02

Pos logit contributions
 Avery+0.024
May+0.024
 Amen+0.023
Elsa+0.023
 Maya+0.022
Neg logit contributions
op-0.022
 bars-0.021
 pierced-0.020
itten-0.020
adders-0.020
Token palm
Token ID18057
Feature activation+1.06e-01

Pos logit contributions
 next+0.046
 Claus+0.042
 Avery+0.041
 timer+0.041
oe+0.040
Neg logit contributions
 unnecessary-0.037
 risks-0.037
 cob-0.036
 muscles-0.036
itten-0.036
Token trees
Token ID7150
Feature activation+2.73e-02

Pos logit contributions
itten+0.138
op+0.132
 tal+0.126
 spike+0.125
 cob+0.123
Neg logit contributions
 next-0.168
 Avery-0.165
 Maya-0.153
 Amen-0.153
 Claus-0.152
Token and
Token ID290
Feature activation+7.74e-04

Pos logit contributions
itten+0.045
op+0.044
 bars+0.043
 spike+0.042
 fault+0.041
Neg logit contributions
 Avery-0.034
iza-0.029
Anna-0.029
 Maya-0.029
 waiter-0.029
Token flowers
Token ID12734
Feature activation+3.48e-02

Pos logit contributions
 unnecessary+0.001
 cob+0.001
 muscles+0.001
 risks+0.001
 tal+0.001
Neg logit contributions
 next-0.001
oe-0.001
 Avery-0.001
 timer-0.001
 Amen-0.001
Token.
Token ID13
Feature activation+9.47e-03

Pos logit contributions
itten+0.054
 bars+0.053
op+0.052
 pierced+0.050
 cob+0.050
Neg logit contributions
 Avery-0.045
 Amen-0.041
Anna-0.041
May-0.040
 Maya-0.039
Token "
Token ID366
Feature activation+1.38e-02

Pos logit contributions
itten+0.017
OO+0.017
 unnecessary+0.016
op+0.016
 fault+0.016
Neg logit contributions
 next-0.011
 Avery-0.009
 timer-0.009
 Claus-0.008
Santa-0.008
Token.
Token ID13
Feature activation+1.51e-02

Pos logit contributions
 unnecessary+0.037
 risks+0.036
 cob+0.036
itten+0.036
 fault+0.035
Neg logit contributions
 Amen-0.026
 timer-0.026
 Claus-0.026
 waiter-0.025
 Avery-0.025
Token As
Token ID1081
Feature activation-1.71e-02

Pos logit contributions
itten+0.025
OO+0.024
 unnecessary+0.024
 fault+0.024
op+0.024
Neg logit contributions
 next-0.020
 timer-0.016
 Avery-0.016
 Claus-0.015
Santa-0.015
Token Daisy
Token ID40355
Feature activation+1.33e-02

Pos logit contributions
 next+0.020
 timer+0.020
 Claus+0.019
 Amen+0.019
 waiter+0.018
Neg logit contributions
 unnecessary-0.028
 risks-0.026
 disease-0.026
 cob-0.025
itten-0.025
Tokenâ
Token ID22940
Feature activation+4.32e-03

Pos logit contributions
itten+0.018
op+0.017
 pierced+0.017
OO+0.017
 tal+0.017
Neg logit contributions
 next-0.020
 Avery-0.018
 timer-0.018
 Amen-0.017
Santa-0.017
Token
Token ID26391
Feature activation+3.02e-02

Pos logit contributions
itten+0.005
OO+0.005
 unnecessary+0.005
op+0.005
 disease+0.005
Neg logit contributions
 next-0.007
 Avery-0.006
 timer-0.006
 Claus-0.006
 Anna-0.006
Token
Token ID8151
Feature activation+1.06e-01

Pos logit contributions
itten+0.041
 unnecessary+0.039
OO+0.038
 disease+0.038
 fault+0.037
Neg logit contributions
 next-0.049
 Avery-0.042
 Claus-0.040
 Anna-0.040
 timer-0.040
Tokens
Token ID82
Feature activation+3.06e-02

Pos logit contributions
itten+0.156
 unnecessary+0.154
OO+0.151
 disease+0.148
 fault+0.147
Neg logit contributions
 next-0.155
 Avery-0.131
 timer-0.125
 Anna-0.123
 Claus-0.123
Token gaze
Token ID17841
Feature activation-2.42e-02

Pos logit contributions
itten+0.039
 unnecessary+0.039
 fault+0.039
 disease+0.038
 risks+0.038
Neg logit contributions
 next-0.047
 timer-0.044
 Avery-0.044
 Claus-0.042
Santa-0.042
Token passed
Token ID3804
Feature activation-1.62e-02

Pos logit contributions
 next+0.034
 timer+0.033
 Amen+0.032
Anna+0.030
 Claus+0.030
Neg logit contributions
 unnecessary-0.033
itten-0.032
OO-0.032
op-0.032
 pierced-0.032
Token by
Token ID416
Feature activation-1.50e-03

Pos logit contributions
 timer+0.023
 Amen+0.023
 Claus+0.022
 next+0.022
 waiter+0.021
Neg logit contributions
 risks-0.023
 unnecessary-0.022
 cob-0.021
itten-0.020
 pierced-0.020
Token,
Token ID11
Feature activation+2.09e-02

Pos logit contributions
 timer+0.002
 Claus+0.002
 Amen+0.002
 next+0.002
 waiter+0.002
Neg logit contributions
 cob-0.002
 risks-0.002
 unnecessary-0.002
 fault-0.002
 disease-0.002
Token,
Token ID11
Feature activation+1.06e-02

Pos logit contributions
itten+0.080
op+0.075
 fault+0.075
OO+0.075
 unnecessary+0.074
Neg logit contributions
 Avery-0.040
 Amen-0.040
 next-0.039
 Claus-0.037
Santa-0.036
Token and
Token ID290
Feature activation+4.54e-03

Pos logit contributions
itten+0.017
 unnecessary+0.016
 disease+0.016
 fault+0.016
 sap+0.016
Neg logit contributions
 next-0.013
 Claus-0.012
 Avery-0.012
 Amen-0.012
 timer-0.011
Token Pete
Token ID17542
Feature activation+1.56e-02

Pos logit contributions
itten+0.007
 unnecessary+0.007
OO+0.007
 disease+0.007
op+0.006
Neg logit contributions
 next-0.006
 timer-0.005
 Claus-0.005
 Avery-0.005
 Amen-0.005
Tokenâ
Token ID22940
Feature activation+3.28e-03

Pos logit contributions
itten+0.021
op+0.021
OO+0.020
 pierced+0.019
 tal+0.019
Neg logit contributions
 next-0.024
 Avery-0.021
 timer-0.021
 Amen-0.021
Santa-0.020
Token
Token ID26391
Feature activation+2.61e-02

Pos logit contributions
itten+0.004
OO+0.004
 unnecessary+0.004
op+0.004
 disease+0.004
Neg logit contributions
 next-0.006
 Avery-0.005
 timer-0.005
 Claus-0.005
 Anna-0.005
Token
Token ID8151
Feature activation+1.05e-01

Pos logit contributions
 unnecessary+0.033
itten+0.033
 disease+0.032
OO+0.031
 fault+0.030
Neg logit contributions
 next-0.044
 Avery-0.038
 Claus-0.037
 Anna-0.036
oe-0.036
Tokens
Token ID82
Feature activation+3.54e-02

Pos logit contributions
itten+0.162
 unnecessary+0.157
OO+0.156
 disease+0.152
op+0.152
Neg logit contributions
 next-0.148
 Avery-0.124
 timer-0.119
 Claus-0.117
 Anna-0.116
Token heart
Token ID2612
Feature activation+4.74e-03

Pos logit contributions
itten+0.049
 unnecessary+0.049
OO+0.047
 disease+0.045
eware+0.044
Neg logit contributions
 next-0.056
 Avery-0.047
 Anna-0.046
 smiling-0.045
 timer-0.044
Token raced
Token ID32897
Feature activation-8.87e-03

Pos logit contributions
itten+0.007
 unnecessary+0.007
OO+0.007
 disease+0.006
op+0.006
Neg logit contributions
 next-0.007
 Avery-0.006
 timer-0.006
 Claus-0.006
 Anna-0.006
Token as
Token ID355
Feature activation-5.43e-03

Pos logit contributions
 next+0.012
 Claus+0.012
 timer+0.012
 Amen+0.012
 Avery+0.011
Neg logit contributions
itten-0.012
 unnecessary-0.012
 risks-0.012
OO-0.011
op-0.011
Token he
Token ID339
Feature activation-1.73e-02

Pos logit contributions
 timer+0.007
 Amen+0.007
 Claus+0.007
 menus+0.006
 Avery+0.006
Neg logit contributions
apy-0.009
op-0.008
 unnecessary-0.008
 spike-0.008
 footing-0.008
Token and
Token ID290
Feature activation+5.04e-03

Pos logit contributions
itten+0.028
 pierced+0.027
op+0.027
 disease+0.026
age+0.026
Neg logit contributions
 next-0.032
 Avery-0.031
 Amen-0.029
Santa-0.029
Cl-0.029
Token the
Token ID262
Feature activation-3.43e-03

Pos logit contributions
itten+0.009
 unnecessary+0.009
OO+0.009
 disease+0.009
op+0.009
Neg logit contributions
 next-0.006
 Avery-0.005
 timer-0.004
 Claus-0.004
 Anna-0.004
Token boy
Token ID2933
Feature activation+6.26e-03

Pos logit contributions
 next+0.004
 Avery+0.004
oe+0.004
 timer+0.003
 Claus+0.003
Neg logit contributions
 unnecessary-0.006
itten-0.006
 disease-0.005
 risks-0.005
 spike-0.005
Tokenâ
Token ID22940
Feature activation-2.09e-03

Pos logit contributions
itten+0.007
op+0.007
 pierced+0.007
OO+0.007
 tal+0.007
Neg logit contributions
 next-0.011
 Avery-0.009
 timer-0.009
Santa-0.009
 waiter-0.009
Token
Token ID26391
Feature activation+2.75e-02

Pos logit contributions
 next+0.004
 Avery+0.003
 timer+0.003
 Claus+0.003
 Anna+0.003
Neg logit contributions
itten-0.003
OO-0.003
 unnecessary-0.002
op-0.002
 disease-0.002
Token
Token ID8151
Feature activation+1.05e-01

Pos logit contributions
itten+0.051
 unnecessary+0.051
 disease+0.050
OO+0.049
 risks+0.048
Neg logit contributions
 next-0.030
 Avery-0.023
 Claus-0.023
 Anna-0.022
oe-0.021
Tokens
Token ID82
Feature activation+3.68e-02

Pos logit contributions
itten+0.161
OO+0.154
 unnecessary+0.153
op+0.152
 fault+0.149
Neg logit contributions
 next-0.149
 Avery-0.127
 timer-0.121
 Claus-0.119
 Anna-0.116
Token cane
Token ID33009
Feature activation+3.67e-02

Pos logit contributions
itten+0.052
 unnecessary+0.050
OO+0.050
op+0.048
 disease+0.048
Neg logit contributions
 next-0.057
 Avery-0.049
 timer-0.047
 Claus-0.046
 Anna-0.046
Token looked
Token ID3114
Feature activation-1.23e-02

Pos logit contributions
itten+0.046
op+0.044
 unnecessary+0.044
OO+0.044
 fault+0.043
Neg logit contributions
 next-0.060
 Avery-0.054
 timer-0.051
Santa-0.051
 Anna-0.050
Token very
Token ID845
Feature activation-2.44e-02

Pos logit contributions
Anna+0.021
 next+0.021
 Claus+0.021
 timer+0.020
 Avery+0.020
Neg logit contributions
 unnecessary-0.014
 risks-0.013
 fault-0.012
OO-0.012
 cob-0.012
Token similar
Token ID2092
Feature activation+1.80e-02

Pos logit contributions
 Claus+0.039
 next+0.039
 timer+0.039
Anna+0.038
 Amen+0.038
Neg logit contributions
 unnecessary-0.029
 risks-0.026
apy-0.026
 cob-0.025
 mattered-0.024
Token Mom
Token ID11254
Feature activation+6.03e-03

Pos logit contributions
 timer+0.010
 menus+0.009
 next+0.009
Say+0.009
 Claus+0.009
Neg logit contributions
itten-0.008
OO-0.008
age-0.008
op-0.007
 tal-0.007
Token.
Token ID13
Feature activation+3.57e-03

Pos logit contributions
itten+0.008
age+0.007
op+0.006
 fault+0.006
're+0.006
Neg logit contributions
 timer-0.010
Say-0.010
Cl-0.009
 Avery-0.009
 mer-0.009
Token You
Token ID921
Feature activation+1.24e-02

Pos logit contributions
itten+0.005
 Everything+0.005
op+0.004
 Nut+0.004
 Timothy+0.004
Neg logit contributions
 timer-0.006
 next-0.006
 menus-0.005
anut-0.005
kins-0.005
Tokenâ
Token ID22940
Feature activation+1.38e-02

Pos logit contributions
itten+0.010
 unnecessary+0.009
OO+0.009
 disease+0.009
op+0.009
Neg logit contributions
 next-0.027
 Avery-0.024
 Claus-0.024
 timer-0.023
 Anna-0.023
Token
Token ID26391
Feature activation+2.50e-02

Pos logit contributions
itten+0.016
OO+0.016
 unnecessary+0.015
age+0.015
 risks+0.015
Neg logit contributions
 next-0.024
 Avery-0.021
 timer-0.021
 Anna-0.020
 Claus-0.020
Token
Token ID8151
Feature activation+1.05e-01

Pos logit contributions
itten+0.034
op+0.033
OO+0.032
 unnecessary+0.032
 fault+0.031
Neg logit contributions
 next-0.040
 timer-0.035
 Avery-0.035
 waiter-0.032
 Claire-0.032
Tokenre
Token ID260
Feature activation+2.90e-02

Pos logit contributions
itten+0.148
op+0.141
OO+0.141
 fault+0.135
 unnecessary+0.133
Neg logit contributions
 next-0.164
 Avery-0.143
 timer-0.139
 Claus-0.134
Santa-0.132
Token very
Token ID845
Feature activation-1.45e-04

Pos logit contributions
itten+0.045
 unnecessary+0.045
OO+0.045
 tal+0.044
 fault+0.043
Neg logit contributions
 next-0.037
 timer-0.033
 Avery-0.032
 Claus-0.030
Santa-0.030
Token powerful
Token ID3665
Feature activation+2.89e-02

Pos logit contributions
 next+0.000
 timer+0.000
 Avery+0.000
Santa+0.000
 Claus+0.000
Neg logit contributions
itten-0.000
 unnecessary-0.000
op-0.000
uch-0.000
OO-0.000
Token!
Token ID0
Feature activation+2.12e-02

Pos logit contributions
 fault+0.042
 muscles+0.040
itten+0.039
 risks+0.038
op+0.038
Neg logit contributions
 timer-0.041
 Claus-0.038
Santa-0.038
Cl-0.038
 Avery-0.037
Tokenâ
Token ID22940
Feature activation+1.78e-03

Pos logit contributions
itten+0.031
op+0.029
 Everything+0.028
ending+0.027
 tal+0.027
Neg logit contributions
 timer-0.033
 next-0.031
 menus-0.028
maid-0.027
 Claus-0.027
Token chair
Token ID5118
Feature activation-6.93e-03

Pos logit contributions
 Avery+0.000
ory+0.000
 next+0.000
 Claus+0.000
Cl+0.000
Neg logit contributions
 cob-0.000
 unnecessary-0.000
 disease-0.000
 risks-0.000
inter-0.000
Token and
Token ID290
Feature activation+8.57e-03

Pos logit contributions
 Avery+0.007
Santa+0.007
 Amen+0.007
Cl+0.007
May+0.007
Neg logit contributions
age-0.012
op-0.012
itten-0.012
 fault-0.012
bar-0.011
Token we
Token ID356
Feature activation+2.98e-02

Pos logit contributions
 unnecessary+0.012
itten+0.012
 fault+0.012
 risks+0.012
 disease+0.012
Neg logit contributions
 next-0.012
 timer-0.010
Cl-0.010
 Avery-0.010
Santa-0.010
Tokenâ
Token ID22940
Feature activation+5.37e-03

Pos logit contributions
itten+0.041
 unnecessary+0.040
OO+0.040
op+0.040
 fault+0.039
Neg logit contributions
 next-0.046
 Avery-0.040
 timer-0.039
 Claus-0.038
 Anna-0.037
Token
Token ID26391
Feature activation+2.23e-02

Pos logit contributions
itten+0.006
OO+0.006
 unnecessary+0.006
op+0.006
 disease+0.006
Neg logit contributions
 next-0.009
 Avery-0.008
 timer-0.008
 Claus-0.008
 Anna-0.008
Token
Token ID8151
Feature activation+1.04e-01

Pos logit contributions
itten+0.019
 unnecessary+0.018
OO+0.018
op+0.018
 disease+0.017
Neg logit contributions
 next-0.046
 Avery-0.041
 timer-0.040
 Claus-0.040
 Anna-0.040
Tokenll
Token ID297
Feature activation+1.18e-03

Pos logit contributions
itten+0.147
op+0.142
OO+0.137
 fault+0.135
 unnecessary+0.129
Neg logit contributions
 next-0.163
 Avery-0.148
 timer-0.143
 Claus-0.139
Santa-0.136
Token finish
Token ID5461
Feature activation-2.40e-02

Pos logit contributions
itten+0.002
age+0.001
op+0.001
OO+0.001
 unnecessary+0.001
Neg logit contributions
 next-0.002
 Avery-0.002
 Claus-0.002
 timer-0.002
 waiter-0.002
Token the
Token ID262
Feature activation-1.40e-02

Pos logit contributions
 Claus+0.037
 waiter+0.036
 keeper+0.034
 timer+0.034
 Amen+0.033
Neg logit contributions
 fault-0.032
 unnecessary-0.031
op-0.029
 cob-0.029
 risks-0.029
Token rest
Token ID1334
Feature activation-2.34e-02

Pos logit contributions
 next+0.022
 Avery+0.021
 Claus+0.021
ory+0.021
 Amber+0.020
Neg logit contributions
 unnecessary-0.017
 cob-0.017
itten-0.017
 disease-0.016
 risks-0.016
Token of
Token ID286
Feature activation-4.94e-03

Pos logit contributions
 Amen+0.032
 Claus+0.032
Santa+0.029
 Prince+0.028
Friend+0.028
Neg logit contributions
op-0.038
itten-0.037
 fault-0.033
ops-0.032
akes-0.032
Token for
Token ID329
Feature activation+3.11e-02

Pos logit contributions
itten+0.090
op+0.090
 fault+0.089
 unnecessary+0.088
 risks+0.088
Neg logit contributions
 next-0.039
 Avery-0.035
 timer-0.031
Santa-0.031
 Claus-0.029
Token Strip
Token ID18508
Feature activation-7.57e-03

Pos logit contributions
itten+0.058
OO+0.054
op+0.054
eware+0.053
inter+0.053
Neg logit contributions
 next-0.038
 Anna-0.030
 Avery-0.029
 timer-0.028
 smiling-0.028
Tokenes
Token ID274
Feature activation+2.45e-02

Pos logit contributions
 next+0.013
 timer+0.012
 Avery+0.011
 Amen+0.011
Santa+0.011
Neg logit contributions
op-0.009
 unnecessary-0.009
OO-0.009
itten-0.009
 risks-0.009
Tokenâ
Token ID22940
Feature activation+4.61e-03

Pos logit contributions
itten+0.045
op+0.045
 fault+0.044
 unnecessary+0.044
 disease+0.044
Neg logit contributions
 next-0.024
 Avery-0.022
 timer-0.020
 Amen-0.019
Santa-0.019
Token
Token ID26391
Feature activation+3.34e-02

Pos logit contributions
itten+0.006
OO+0.006
 unnecessary+0.005
age+0.005
 risks+0.005
Neg logit contributions
 next-0.008
 Avery-0.007
 timer-0.007
 Anna-0.007
 Claus-0.007
Token
Token ID8151
Feature activation+1.04e-01

Pos logit contributions
itten+0.030
 unnecessary+0.029
OO+0.028
 disease+0.027
 fault+0.026
Neg logit contributions
 next-0.069
 Avery-0.061
 Claus-0.059
 Anna-0.059
 timer-0.058
Token family
Token ID1641
Feature activation+2.08e-02

Pos logit contributions
itten+0.170
op+0.169
 fault+0.160
OO+0.158
ending+0.151
Neg logit contributions
 next-0.135
 Avery-0.123
 timer-0.121
 Amen-0.114
 lad-0.114
Token.
Token ID13
Feature activation+3.11e-04

Pos logit contributions
itten+0.041
op+0.041
 fault+0.040
 unnecessary+0.039
 risks+0.039
Neg logit contributions
 next-0.018
 timer-0.016
 Avery-0.016
Santa-0.014
 Amen-0.014
Token Spot
Token ID15899
Feature activation-1.44e-03

Pos logit contributions
op+0.000
 fault+0.000
 muscles+0.000
 unnecessary+0.000
 Ow+0.000
Neg logit contributions
 next-0.000
 timer-0.000
 Avery-0.000
iza-0.000
maid-0.000
Token finally
Token ID3443
Feature activation-4.22e-03

Pos logit contributions
 next+0.002
 Avery+0.002
 timer+0.002
 Amen+0.002
Santa+0.002
Neg logit contributions
op-0.002
itten-0.002
OO-0.002
 pierced-0.002
 unnecessary-0.002
Token found
Token ID1043
Feature activation+1.38e-02

Pos logit contributions
 next+0.006
 Avery+0.005
 timer+0.005
 Claus+0.005
 Anna+0.005
Neg logit contributions
itten-0.006
OO-0.006
 unnecessary-0.006
 disease-0.006
op-0.006
Token favourite
Token ID12507
Feature activation+1.35e-02

Pos logit contributions
 fault+0.015
 muscles+0.014
 unnecessary+0.014
 footing+0.014
OO+0.014
Neg logit contributions
 Amen-0.017
Santa-0.017
 Avery-0.017
 next-0.017
 Claus-0.016
Token.
Token ID13
Feature activation+7.73e-03

Pos logit contributions
itten+0.018
OO+0.017
op+0.017
 fault+0.017
 disease+0.017
Neg logit contributions
 next-0.020
 Avery-0.019
 Claus-0.019
Santa-0.018
 Amen-0.017
Token Toby
Token ID37578
Feature activation-5.11e-03

Pos logit contributions
op+0.010
 Everything+0.010
inter+0.009
itten+0.009
apy+0.009
Neg logit contributions
 next-0.013
 timer-0.012
 keeper-0.011
 bride-0.011
 priest-0.011
Tokenâ
Token ID22940
Feature activation-3.47e-04

Pos logit contributions
 next+0.008
 Amen+0.007
Santa+0.007
 timer+0.007
 Avery+0.007
Neg logit contributions
op-0.007
itten-0.007
OO-0.006
 pierced-0.006
 tal-0.006
Token
Token ID26391
Feature activation+2.67e-02

Pos logit contributions
 next+0.001
 Avery+0.001
 timer+0.001
 waiter+0.001
 Anna+0.001
Neg logit contributions
itten-0.000
OO-0.000
age-0.000
inter-0.000
op-0.000
Token
Token ID8151
Feature activation+1.04e-01

Pos logit contributions
itten+0.035
 unnecessary+0.033
OO+0.033
op+0.032
 disease+0.032
Neg logit contributions
 next-0.044
 Avery-0.038
 timer-0.037
 Claus-0.036
 Anna-0.036
Tokens
Token ID82
Feature activation+2.54e-02

Pos logit contributions
itten+0.155
op+0.148
OO+0.147
 fault+0.142
 unnecessary+0.139
Neg logit contributions
 next-0.154
 Avery-0.131
 timer-0.130
 Claus-0.126
Santa-0.123
Token cream
Token ID8566
Feature activation-2.23e-03

Pos logit contributions
itten+0.035
 fault+0.035
OO+0.034
op+0.034
 disease+0.033
Neg logit contributions
 next-0.038
Santa-0.032
 Avery-0.032
 Claus-0.032
 timer-0.032
Token won
Token ID1839
Feature activation+1.01e-02

Pos logit contributions
 next+0.003
 Avery+0.003
 Amen+0.003
Santa+0.003
Cl+0.003
Neg logit contributions
itten-0.003
op-0.003
OO-0.003
 fault-0.003
 spike-0.003
Token!
Token ID0
Feature activation+2.14e-02

Pos logit contributions
itten+0.015
op+0.015
OO+0.015
 fault+0.015
 pierced+0.014
Neg logit contributions
 next-0.013
 timer-0.012
 Avery-0.012
 waiter-0.012
 Amen-0.012
Token The
Token ID383
Feature activation-1.31e-02

Pos logit contributions
op+0.027
itten+0.027
 Everything+0.027
inter+0.026
 tal+0.025
Neg logit contributions
 next-0.035
 timer-0.035
 Claus-0.030
 bride-0.030
 priest-0.029
Token restless
Token ID42537
Feature activation+3.50e-02

Pos logit contributions
 unnecessary+0.043
itten+0.042
 fault+0.042
 disease+0.041
age+0.040
Neg logit contributions
 next-0.045
 Avery-0.041
 timer-0.041
 Claus-0.039
 lad-0.038
Token and
Token ID290
Feature activation+2.56e-03

Pos logit contributions
 fault+0.057
itten+0.054
 muscles+0.054
 disease+0.054
 unnecessary+0.054
Neg logit contributions
 next-0.038
 Avery-0.037
 Claus-0.037
Santa-0.036
Cl-0.036
Token couldn
Token ID3521
Feature activation+2.45e-02

Pos logit contributions
 disease+0.003
 weren+0.003
 unnecessary+0.003
 injury+0.003
 fault+0.003
Neg logit contributions
 next-0.004
 timer-0.004
Anna-0.004
oe-0.004
Cl-0.004
Tokenâ
Token ID22940
Feature activation+1.15e-02

Pos logit contributions
itten+0.031
 unnecessary+0.030
OO+0.030
op+0.029
 disease+0.029
Neg logit contributions
 next-0.041
 Avery-0.036
 timer-0.034
 Claus-0.034
 Anna-0.033
Token
Token ID26391
Feature activation+3.83e-02

Pos logit contributions
itten+0.015
 unnecessary+0.014
OO+0.014
op+0.014
 disease+0.014
Neg logit contributions
 next-0.019
 Avery-0.017
 timer-0.016
 Claus-0.016
 Anna-0.016
Token
Token ID8151
Feature activation+1.04e-01

Pos logit contributions
itten+0.038
 unnecessary+0.036
 disease+0.035
OO+0.034
 spike+0.033
Neg logit contributions
 next-0.075
 Claus-0.066
 Avery-0.065
 Anna-0.063
oe-0.062
Tokent
Token ID83
Feature activation+3.01e-02

Pos logit contributions
itten+0.088
op+0.081
OO+0.080
 unnecessary+0.078
 fault+0.075
Neg logit contributions
 next-0.219
 Avery-0.199
 timer-0.194
 Claus-0.190
 Anna-0.188
Token get
Token ID651
Feature activation-9.70e-03

Pos logit contributions
itten+0.038
op+0.036
 unnecessary+0.036
 disease+0.035
OO+0.035
Neg logit contributions
 next-0.049
 Avery-0.044
 timer-0.043
 Claus-0.042
 menus-0.041
Token to
Token ID284
Feature activation+1.94e-02

Pos logit contributions
 next+0.014
 timer+0.014
Anna+0.014
 Avery+0.014
May+0.013
Neg logit contributions
 unnecessary-0.013
 disease-0.012
 risks-0.012
 fault-0.012
itten-0.012
Token sleep
Token ID3993
Feature activation-4.55e-02

Pos logit contributions
itten+0.026
 unnecessary+0.026
 disease+0.025
 fault+0.025
OO+0.025
Neg logit contributions
 next-0.030
 Avery-0.026
 timer-0.026
 Claus-0.025
oe-0.025
Token.
Token ID13
Feature activation+3.56e-03

Pos logit contributions
 Avery+0.048
 Claus+0.046
 next+0.046
 waiter+0.046
Anna+0.045
Neg logit contributions
 disease-0.072
 injury-0.070
 unnecessary-0.069
op-0.068
 risks-0.068
Token.
Token ID13
Feature activation+6.54e-03

Pos logit contributions
 fault+0.001
 risks+0.001
 unnecessary+0.001
 muscles+0.001
op+0.001
Neg logit contributions
 Amen-0.001
 Claus-0.001
Anna-0.001
Santa-0.001
 timer-0.001
Token On
Token ID1550
Feature activation+1.46e-02

Pos logit contributions
 fault+0.009
OO+0.009
op+0.009
itten+0.009
 unnecessary+0.009
Neg logit contributions
 next-0.009
 timer-0.009
 bride-0.008
 Claus-0.008
oe-0.007
Token Mother
Token ID10584
Feature activation+1.13e-02

Pos logit contributions
itten+0.024
 fault+0.023
OO+0.023
op+0.023
 unnecessary+0.023
Neg logit contributions
 next-0.018
 Claus-0.016
 timer-0.015
 Avery-0.015
Santa-0.014
Tokenâ
Token ID22940
Feature activation+2.04e-03

Pos logit contributions
op+0.015
itten+0.014
ending+0.014
 fault+0.014
ops+0.013
Neg logit contributions
 Amen-0.016
 next-0.016
Cl-0.016
Santa-0.015
 timer-0.015
Token
Token ID26391
Feature activation+2.66e-02

Pos logit contributions
itten+0.002
OO+0.002
 unnecessary+0.002
age+0.002
inter+0.002
Neg logit contributions
 next-0.004
 Avery-0.003
 Claus-0.003
 Anna-0.003
 waiter-0.003
Token
Token ID8151
Feature activation+1.03e-01

Pos logit contributions
itten+0.029
 unnecessary+0.028
 disease+0.027
OO+0.026
 pierced+0.026
Neg logit contributions
 next-0.050
 Avery-0.044
 Claus-0.042
 Anna-0.042
oe-0.041
Tokens
Token ID82
Feature activation+4.32e-02

Pos logit contributions
itten+0.157
op+0.151
OO+0.149
 fault+0.146
 unnecessary+0.144
Neg logit contributions
 next-0.149
 Avery-0.128
 timer-0.126
 Claus-0.122
Santa-0.119
Token Day
Token ID3596
Feature activation-4.37e-02

Pos logit contributions
itten+0.054
 unnecessary+0.053
 fault+0.053
op+0.052
OO+0.052
Neg logit contributions
 next-0.071
 Avery-0.062
 timer-0.061
 Claus-0.060
Santa-0.060
Token,
Token ID11
Feature activation+1.88e-02

Pos logit contributions
 timer+0.068
 next+0.067
 Claus+0.067
 Amen+0.067
maid+0.064
Neg logit contributions
 unnecessary-0.051
op-0.051
 risks-0.051
OO-0.048
ilst-0.048
Token they
Token ID484
Feature activation+5.53e-03

Pos logit contributions
op+0.026
 muscles+0.026
 unnecessary+0.025
itten+0.025
 risks+0.025
Neg logit contributions
 timer-0.023
 next-0.023
 Amen-0.023
 Claus-0.023
oe-0.022
Token presented
Token ID5545
Feature activation-7.91e-03

Pos logit contributions
op+0.006
itten+0.006
're+0.005
 weren+0.005
aved+0.005
Neg logit contributions
 next-0.010
 timer-0.009
Santa-0.009
 Avery-0.009
Cl-0.009
Token He
Token ID679
Feature activation-9.41e-03

Pos logit contributions
 Ow+0.005
eware+0.005
itten+0.004
 Everything+0.004
 tal+0.004
Neg logit contributions
 next-0.006
 timer-0.006
oe-0.006
iza-0.006
kins-0.005
Token has
Token ID468
Feature activation+1.03e-02

Pos logit contributions
 next+0.016
 timer+0.015
 Avery+0.015
Cl+0.014
oe+0.014
Neg logit contributions
 happened-0.010
're-0.010
uch-0.010
age-0.010
 pierced-0.010
Token black
Token ID2042
Feature activation+9.93e-02

Pos logit contributions
 unnecessary+0.014
 fault+0.013
 risks+0.013
 muscles+0.013
 disease+0.013
Neg logit contributions
 next-0.016
 timer-0.015
oe-0.014
 Avery-0.014
 Amen-0.014
Token hair
Token ID4190
Feature activation+7.72e-03

Pos logit contributions
itten+0.150
OO+0.145
 fault+0.142
 unnecessary+0.142
 disease+0.141
Neg logit contributions
 next-0.140
 Avery-0.124
 timer-0.119
 Claus-0.117
iza-0.115
Token and
Token ID290
Feature activation-2.67e-03

Pos logit contributions
op+0.013
itten+0.012
age+0.012
ops+0.012
 fault+0.011
Neg logit contributions
 Avery-0.009
 Claire-0.009
 timer-0.008
Friend-0.008
iza-0.008
Token black
Token ID2042
Feature activation+1.03e-01

Pos logit contributions
 next+0.004
oe+0.004
 timer+0.004
iza+0.004
 Avery+0.004
Neg logit contributions
 muscles-0.003
 unnecessary-0.003
itten-0.003
 tal-0.003
 disease-0.003
Token eyes
Token ID2951
Feature activation+5.29e-02

Pos logit contributions
itten+0.158
 muscles+0.152
OO+0.152
 fault+0.150
 risks+0.149
Neg logit contributions
 next-0.139
 Avery-0.130
 Claus-0.121
 timer-0.121
iza-0.120
Token.
Token ID13
Feature activation+8.54e-03

Pos logit contributions
age+0.086
itten+0.083
op+0.081
 fault+0.080
 muscles+0.080
Neg logit contributions
 Avery-0.061
iza-0.058
 next-0.055
 timer-0.054
 Amber-0.054
Token He
Token ID679
Feature activation-5.37e-03

Pos logit contributions
 Ow+0.011
 muscles+0.011
eware+0.011
 Everything+0.011
itten+0.010
Neg logit contributions
 next-0.013
 timer-0.013
oe-0.013
iza-0.012
maid-0.011
Token's
Token ID338
Feature activation+3.02e-02

Pos logit contributions
 next+0.008
 timer+0.008
 Avery+0.008
oe+0.008
Cl+0.007
Neg logit contributions
 happened-0.006
itten-0.006
 pierced-0.006
 cob-0.006
age-0.006
Token very
Token ID845
Feature activation-1.64e-02

Pos logit contributions
 fault+0.042
 cob+0.039
 disease+0.039
 unnecessary+0.038
 muscles+0.038
Neg logit contributions
 next-0.045
 timer-0.044
oe-0.042
 Avery-0.041
iza-0.040
Token,
Token ID11
Feature activation+2.22e-02

Pos logit contributions
op+0.008
itten+0.008
age+0.008
OO+0.008
 unnecessary+0.008
Neg logit contributions
 timer-0.006
 next-0.006
 waiter-0.006
 Amen-0.006
 Avery-0.005
Token and
Token ID290
Feature activation+8.27e-03

Pos logit contributions
 unnecessary+0.030
itten+0.030
're+0.029
 risks+0.029
OO+0.029
Neg logit contributions
 timer-0.029
 Amen-0.029
 next-0.027
 Claus-0.027
oe-0.026
Token I
Token ID314
Feature activation+1.09e-02

Pos logit contributions
 unnecessary+0.012
 risks+0.012
itten+0.011
 disease+0.011
 muscles+0.011
Neg logit contributions
 timer-0.010
 next-0.010
oe-0.010
 Avery-0.010
 Claus-0.009
Tokenâ
Token ID22940
Feature activation+7.56e-03

Pos logit contributions
itten+0.010
op+0.010
age+0.010
OO+0.010
 fault+0.010
Neg logit contributions
 next-0.021
 timer-0.020
 Avery-0.019
Santa-0.018
 Claus-0.018
Token
Token ID26391
Feature activation+2.07e-02

Pos logit contributions
itten+0.009
OO+0.009
age+0.008
 unnecessary+0.008
 risks+0.008
Neg logit contributions
 next-0.013
 Avery-0.012
 timer-0.012
 Claus-0.011
 Anna-0.011
Token
Token ID8151
Feature activation+1.03e-01

Pos logit contributions
itten+0.020
op+0.020
OO+0.019
 unnecessary+0.018
eware+0.017
Neg logit contributions
 next-0.042
 timer-0.038
 Avery-0.037
 waiter-0.036
 kindly-0.035
Tokenll
Token ID297
Feature activation-1.54e-03

Pos logit contributions
itten+0.142
op+0.139
OO+0.131
 fault+0.123
ending+0.120
Neg logit contributions
 next-0.167
 Avery-0.153
 timer-0.150
 Claus-0.143
 lad-0.142
Token show
Token ID905
Feature activation-8.22e-03

Pos logit contributions
 next+0.003
 Avery+0.002
 waiter+0.002
 Anna+0.002
 timer+0.002
Neg logit contributions
itten-0.002
age-0.002
're-0.002
op-0.002
 risks-0.002
Token you
Token ID345
Feature activation-4.28e-04

Pos logit contributions
 timer+0.009
 next+0.009
 Claus+0.008
oe+0.007
 Amen+0.007
Neg logit contributions
 unnecessary-0.014
 risks-0.014
itten-0.014
 tal-0.013
 disease-0.013
Token around
Token ID1088
Feature activation+1.98e-02

Pos logit contributions
 timer+0.001
 next+0.001
 Claus+0.001
 Amen+0.001
 Avery+0.001
Neg logit contributions
 unnecessary-0.001
 risks-0.001
itten-0.001
 fault-0.001
OO-0.001
Token.
Token ID13
Feature activation-1.49e-03

Pos logit contributions
 risks+0.030
 unnecessary+0.030
op+0.027
 cob+0.027
 disease+0.027
Neg logit contributions
 timer-0.026
 Claus-0.025
 Amen-0.024
oe-0.023
 next-0.022
Token You
Token ID921
Feature activation+6.92e-03

Pos logit contributions
itten+0.012
 fault+0.011
eware+0.011
op+0.011
age+0.011
Neg logit contributions
 next-0.010
 timer-0.009
oe-0.008
 Avery-0.008
 kindly-0.008
Token are
Token ID389
Feature activation+3.36e-03

Pos logit contributions
itten+0.009
 unnecessary+0.009
OO+0.009
op+0.008
 disease+0.008
Neg logit contributions
 next-0.012
 Avery-0.010
 timer-0.010
 Claus-0.009
 Anna-0.009
Token a
Token ID257
Feature activation-1.38e-02

Pos logit contributions
 cob+0.004
 unnecessary+0.004
 risks+0.003
 muscles+0.003
 fault+0.003
Neg logit contributions
 timer-0.005
ory-0.005
oe-0.005
 next-0.005
bell-0.005
Token very
Token ID845
Feature activation-1.17e-02

Pos logit contributions
 next+0.022
 Avery+0.021
oe+0.021
 timer+0.021
 Amber+0.020
Neg logit contributions
 unnecessary-0.015
 disease-0.015
 cob-0.014
 risks-0.014
OO-0.014
Token good
Token ID922
Feature activation+5.25e-04

Pos logit contributions
rily+0.021
maid+0.020
 Avery+0.020
 Year+0.019
lights+0.019
Neg logit contributions
 job-0.011
op-0.010
ggy-0.009
 unnecessary-0.009
 belong-0.009
Token writer
Token ID6260
Feature activation-1.00e-01

Pos logit contributions
op+0.001
itten+0.001
 disease+0.001
 fault+0.001
OO+0.001
Neg logit contributions
 next-0.001
 timer-0.001
 Avery-0.001
 Claire-0.001
Cl-0.001
Token.
Token ID13
Feature activation+9.22e-03

Pos logit contributions
 Avery+0.127
 timer+0.123
oe+0.121
ory+0.119
ora+0.116
Neg logit contributions
op-0.139
age-0.138
 infection-0.135
itten-0.131
ops-0.124
Token Your
Token ID3406
Feature activation-1.78e-02

Pos logit contributions
itten+0.014
eware+0.013
 unnecessary+0.013
 fault+0.013
age+0.013
Neg logit contributions
 next-0.012
 timer-0.012
oe-0.011
maid-0.010
 Claus-0.010
Token uncle
Token ID7711
Feature activation-2.18e-02

Pos logit contributions
 Claire+0.024
oe+0.024
Cl+0.024
 Avery+0.024
 next+0.023
Neg logit contributions
 fault-0.027
 disease-0.025
OO-0.024
 unnecessary-0.023
 infection-0.022
Token will
Token ID481
Feature activation-2.93e-02

Pos logit contributions
 timer+0.037
 next+0.034
 Avery+0.034
rily+0.033
Cl+0.033
Neg logit contributions
 fault-0.023
op-0.022
 happened-0.022
ops-0.022
're-0.021
Token be
Token ID307
Feature activation-1.88e-03

Pos logit contributions
 Avery+0.047
 waiter+0.047
 timer+0.047
 menus+0.047
 next+0.047
Neg logit contributions
age-0.030
 disease-0.029
're-0.027
 risks-0.026
 infection-0.026
Token Larry
Token ID13633
Feature activation+3.74e-03

Pos logit contributions
itten+0.006
 unnecessary+0.006
OO+0.006
op+0.006
 disease+0.006
Neg logit contributions
 next-0.006
 Avery-0.005
 timer-0.005
 Claus-0.004
 Anna-0.004
Token was
Token ID373
Feature activation+1.26e-02

Pos logit contributions
op+0.003
itten+0.003
 pierced+0.003
age+0.003
 cob+0.003
Neg logit contributions
 timer-0.008
 Avery-0.007
 next-0.007
 waiter-0.007
 Maya-0.007
Token a
Token ID257
Feature activation-2.87e-02

Pos logit contributions
itten+0.022
 unnecessary+0.021
OO+0.021
 disease+0.021
op+0.021
Neg logit contributions
 next-0.015
 Avery-0.013
 timer-0.012
 Claus-0.012
 Anna-0.011
Token good
Token ID922
Feature activation-8.68e-03

Pos logit contributions
 next+0.036
 Avery+0.032
 timer+0.030
 menus+0.029
 Maya+0.028
Neg logit contributions
itten-0.047
OO-0.045
 disease-0.045
op-0.045
 unnecessary-0.044
Token-
Token ID12
Feature activation-1.87e-02

Pos logit contributions
 next+0.012
 Avery+0.009
 timer+0.009
 Claus+0.009
 Anna+0.009
Neg logit contributions
 unnecessary-0.014
itten-0.014
OO-0.013
 risks-0.013
 weaknesses-0.013
Tokennat
Token ID32353
Feature activation-9.96e-02

Pos logit contributions
 next+0.034
 Avery+0.030
 timer+0.029
 Claus+0.029
 Anna+0.029
Neg logit contributions
itten-0.021
 unnecessary-0.020
OO-0.020
op-0.019
 disease-0.019
Tokenured
Token ID1522
Feature activation-1.18e-02

Pos logit contributions
 next+0.167
 Avery+0.155
 timer+0.148
 Anna+0.147
 waiter+0.143
Neg logit contributions
OO-0.118
itten-0.116
 unnecessary-0.114
 disease-0.113
eware-0.110
Token bird
Token ID6512
Feature activation+1.05e-02

Pos logit contributions
 next+0.019
 Avery+0.016
 timer+0.015
 Claus+0.015
 Anna+0.015
Neg logit contributions
itten-0.016
 unnecessary-0.016
OO-0.016
op-0.015
 disease-0.015
Token and
Token ID290
Feature activation-3.48e-03

Pos logit contributions
itten+0.020
op+0.019
age+0.018
 disease+0.018
OO+0.018
Neg logit contributions
 next-0.011
 Avery-0.010
 timer-0.010
 Maya-0.009
Santa-0.009
Token always
Token ID1464
Feature activation-2.79e-02

Pos logit contributions
 next+0.005
 Avery+0.005
 timer+0.005
 Claus+0.004
oe+0.004
Neg logit contributions
itten-0.005
 unnecessary-0.005
OO-0.005
 disease-0.004
op-0.004
Token looked
Token ID3114
Feature activation+1.32e-03

Pos logit contributions
 next+0.038
 Avery+0.030
 Claus+0.030
 smiling+0.029
 Anna+0.029
Neg logit contributions
itten-0.046
 unnecessary-0.045
OO-0.044
op-0.043
 fault-0.043
Token there
Token ID612
Feature activation+2.51e-02

Pos logit contributions
itten+0.036
OO+0.035
op+0.034
 unnecessary+0.034
 disease+0.033
Neg logit contributions
 next-0.022
 Avery-0.019
 timer-0.019
 Claus-0.018
Santa-0.017
Token lived
Token ID5615
Feature activation-2.84e-02

Pos logit contributions
 unnecessary+0.065
 risks+0.062
 weaknesses+0.059
itten+0.059
 disease+0.058
Neg logit contributions
 next-0.021
 Anna-0.016
Anna-0.012
 and-0.011
 smiled-0.010
Token a
Token ID257
Feature activation-1.43e-02

Pos logit contributions
 next+0.040
 timer+0.038
 Avery+0.037
 Claus+0.036
Santa+0.034
Neg logit contributions
itten-0.038
OO-0.038
 unnecessary-0.037
op-0.037
 Ow-0.036
Token princess
Token ID21752
Feature activation-6.40e-02

Pos logit contributions
 next+0.021
 Avery+0.021
 menus+0.020
 praises+0.019
minded+0.019
Neg logit contributions
 cob-0.019
op-0.018
 spike-0.017
itten-0.017
OO-0.017
Token named
Token ID3706
Feature activation-2.89e-02

Pos logit contributions
 Avery+0.079
 timer+0.077
 next+0.072
 Claire+0.069
 Maya+0.069
Neg logit contributions
op-0.106
age-0.102
itten-0.102
 cob-0.098
OO-0.097
Token Flor
Token ID4432
Feature activation-9.92e-02

Pos logit contributions
 next+0.037
 timer+0.035
guard+0.034
 sailors+0.034
 Amen+0.033
Neg logit contributions
 cob-0.042
 Ow-0.040
OO-0.039
eware-0.038
 webs-0.038
Tokena
Token ID64
Feature activation-1.47e-02

Pos logit contributions
 timer+0.186
 next+0.180
 Amen+0.169
 smiling+0.166
 smiles+0.162
Neg logit contributions
OO-0.112
op-0.111
ending-0.091
eware-0.090
itten-0.087
Token.
Token ID13
Feature activation-2.33e-03

Pos logit contributions
 timer+0.022
 Avery+0.018
 next+0.018
 late+0.016
 pans+0.016
Neg logit contributions
op-0.022
 cob-0.022
OO-0.021
age-0.021
itten-0.020
Token She
Token ID1375
Feature activation-1.20e-02

Pos logit contributions
 next+0.002
 timer+0.002
 Claus+0.002
Santa+0.002
 Avery+0.002
Neg logit contributions
OO-0.004
 Ow-0.004
eware-0.004
itten-0.004
 cob-0.004
Token had
Token ID550
Feature activation+1.39e-02

Pos logit contributions
 timer+0.022
 next+0.022
 Avery+0.021
 menus+0.020
oe+0.020
Neg logit contributions
 cob-0.011
itten-0.011
 pierced-0.010
age-0.010
 disease-0.010
Token long
Token ID890
Feature activation+7.26e-02

Pos logit contributions
itten+0.025
OO+0.024
 unnecessary+0.024
op+0.024
 disease+0.023
Neg logit contributions
 next-0.017
 Avery-0.014
 Anna-0.013
 Maya-0.012
 Claus-0.012
Token.
Token ID13
Feature activation+5.22e-03

Pos logit contributions
itten+0.024
age+0.023
 tal+0.023
 fault+0.022
op+0.022
Neg logit contributions
 Avery-0.026
 timer-0.026
 menus-0.024
Friend-0.024
 Navy-0.023
Token You
Token ID921
Feature activation+1.57e-03

Pos logit contributions
 Ow+0.007
itten+0.007
 Everything+0.007
 fault+0.007
inter+0.007
Neg logit contributions
 next-0.008
 timer-0.007
oe-0.007
iza-0.006
 keeper-0.006
Token are
Token ID389
Feature activation-2.33e-03

Pos logit contributions
itten+0.002
 unnecessary+0.001
OO+0.001
op+0.001
 disease+0.001
Neg logit contributions
 next-0.003
 Avery-0.003
 timer-0.003
 Anna-0.003
 Claus-0.003
Token a
Token ID257
Feature activation-1.79e-02

Pos logit contributions
oe+0.004
 timer+0.004
 next+0.003
ory+0.003
guard+0.003
Neg logit contributions
 cob-0.003
 unnecessary-0.003
 muscles-0.003
 fault-0.002
 risks-0.002
Token great
Token ID1049
Feature activation-9.30e-03

Pos logit contributions
 next+0.028
 Avery+0.027
oe+0.027
 Amber+0.026
 timer+0.026
Neg logit contributions
 disease-0.021
 cob-0.021
 unnecessary-0.020
OO-0.019
 spike-0.019
Token writer
Token ID6260
Feature activation-9.91e-02

Pos logit contributions
 next+0.012
Cl+0.011
oe+0.011
Santa+0.011
 timer+0.011
Neg logit contributions
 fault-0.014
op-0.014
itten-0.014
 disease-0.014
 muscles-0.013
Token."
Token ID526
Feature activation-1.52e-02

Pos logit contributions
 Avery+0.133
 Navy+0.124
Friend+0.123
 Amber+0.121
ory+0.121
Neg logit contributions
age-0.136
op-0.136
 infection-0.134
itten-0.128
ops-0.122
Token his
Token ID465
Feature activation-1.56e-02

Pos logit contributions
 Claus+0.024
 timer+0.023
 next+0.023
May+0.023
Cl+0.023
Neg logit contributions
apy-0.018
 hamm-0.017
 tum-0.017
itten-0.017
 tal-0.017
Token mom
Token ID1995
Feature activation-8.55e-03

Pos logit contributions
 Avery+0.022
Cl+0.022
 next+0.022
May+0.021
Santa+0.021
Neg logit contributions
 fault-0.022
itten-0.021
 muscles-0.021
 cob-0.020
 tum-0.020
Token says
Token ID1139
Feature activation-5.42e-03

Pos logit contributions
mer+0.016
ider+0.016
Cl+0.015
Santa+0.015
overs+0.014
Neg logit contributions
chairs-0.007
 bars-0.006
 cob-0.006
 happened-0.006
oga-0.006
Token.
Token ID13
Feature activation+8.52e-04

Pos logit contributions
Santa+0.006
Anna+0.006
 Avery+0.006
 waiter+0.006
iza+0.006
Neg logit contributions
op-0.009
itten-0.007
ending-0.007
apy-0.007
age-0.007
Token.
Token ID13
Feature activation-4.56e-03

Pos logit contributions
oe+0.036
iza+0.035
 timer+0.034
 Avery+0.033
 lad+0.033
Neg logit contributions
op-0.037
OO-0.036
 unnecessary-0.036
 risks-0.036
 fault-0.036
Token You
Token ID921
Feature activation-6.85e-04

Pos logit contributions
 next+0.006
 timer+0.006
oe+0.006
 Claus+0.006
maid+0.005
Neg logit contributions
 unnecessary-0.006
itten-0.006
 fault-0.006
OO-0.006
 muscles-0.006
Token are
Token ID389
Feature activation-3.83e-03

Pos logit contributions
 next+0.001
 Avery+0.001
 timer+0.001
 Anna+0.001
 Claus+0.001
Neg logit contributions
itten-0.001
 unnecessary-0.001
OO-0.001
op-0.001
 disease-0.001
Token a
Token ID257
Feature activation-1.71e-02

Pos logit contributions
oe+0.006
 timer+0.006
ory+0.006
 next+0.005
maid+0.005
Neg logit contributions
 cob-0.004
 unnecessary-0.004
 risks-0.004
 muscles-0.004
 disease-0.004
Token good
Token ID922
Feature activation-9.28e-04

Pos logit contributions
 next+0.028
oe+0.026
 timer+0.026
 Avery+0.025
rily+0.024
Neg logit contributions
 cob-0.019
 disease-0.019
 unnecessary-0.019
 risks-0.017
OO-0.017
Token writer
Token ID6260
Feature activation-9.80e-02

Pos logit contributions
 next+0.001
Cl+0.001
Santa+0.001
Mr+0.001
 timer+0.001
Neg logit contributions
 fault-0.001
op-0.001
 job-0.001
 disease-0.001
 infection-0.001
Token.
Token ID13
Feature activation+7.47e-04

Pos logit contributions
 Avery+0.122
oe+0.119
Cl+0.115
 timer+0.114
ory+0.114
Neg logit contributions
age-0.137
op-0.136
 infection-0.134
itten-0.130
 fault-0.127
Token But
Token ID887
Feature activation+1.30e-04

Pos logit contributions
itten+0.001
 unnecessary+0.001
eware+0.001
 risks+0.001
OO+0.001
Neg logit contributions
 timer-0.001
 next-0.001
oe-0.001
 Claus-0.001
maid-0.001
Token you
Token ID345
Feature activation+1.73e-03

Pos logit contributions
itten+0.000
 unnecessary+0.000
OO+0.000
 disease+0.000
 fault+0.000
Neg logit contributions
 next-0.000
 Avery-0.000
 Claus-0.000
 timer-0.000
Santa-0.000
Token can
Token ID460
Feature activation-1.65e-02

Pos logit contributions
itten+0.003
 unnecessary+0.003
OO+0.003
op+0.003
 disease+0.003
Neg logit contributions
 next-0.002
 Avery-0.002
 timer-0.002
 Claus-0.002
 Anna-0.002
Token write
Token ID3551
Feature activation-3.57e-02

Pos logit contributions
 next+0.025
 timer+0.022
 Avery+0.022
 waiter+0.022
Cl+0.022
Neg logit contributions
op-0.021
itten-0.020
age-0.019
 fault-0.019
uch-0.019
Token was
Token ID373
Feature activation+1.83e-02

Pos logit contributions
 unnecessary+0.068
 risks+0.064
 weaknesses+0.062
itten+0.061
 disease+0.061
Neg logit contributions
 next-0.022
 Anna-0.017
Anna-0.013
 and-0.012
 also-0.011
Token a
Token ID257
Feature activation-1.85e-02

Pos logit contributions
itten+0.029
 unnecessary+0.028
OO+0.028
 disease+0.028
op+0.028
Neg logit contributions
 next-0.025
 Avery-0.021
 timer-0.020
 Claus-0.020
Santa-0.019
Token little
Token ID1310
Feature activation-1.33e-02

Pos logit contributions
 next+0.027
 Avery+0.027
 menus+0.025
 praises+0.024
minded+0.024
Neg logit contributions
 cob-0.024
op-0.023
itten-0.023
 spike-0.023
OO-0.023
Token girl
Token ID2576
Feature activation-2.17e-02

Pos logit contributions
 next+0.015
 Avery+0.013
Santa+0.012
 timer+0.012
Cl+0.012
Neg logit contributions
itten-0.025
op-0.024
 cob-0.023
 spike-0.023
 disease-0.022
Token named
Token ID3706
Feature activation-2.29e-02

Pos logit contributions
 Avery+0.032
 timer+0.031
 menus+0.030
 waiter+0.029
 Claus+0.029
Neg logit contributions
op-0.031
itten-0.030
 cob-0.030
age-0.028
 spike-0.026
Token Flor
Token ID4432
Feature activation-9.79e-02

Pos logit contributions
 next+0.021
 Avery+0.018
 Claus+0.018
 timer+0.017
Santa+0.017
Neg logit contributions
itten-0.044
OO-0.043
 unnecessary-0.042
op-0.042
 cob-0.042
Tokena
Token ID64
Feature activation-1.29e-02

Pos logit contributions
 timer+0.174
 next+0.171
 smiling+0.154
 smiles+0.151
 thanks+0.150
Neg logit contributions
OO-0.116
op-0.113
ending-0.095
eware-0.093
itten-0.093
Token.
Token ID13
Feature activation+9.79e-04

Pos logit contributions
 timer+0.017
 Avery+0.016
 next+0.015
 Maya+0.014
 menus+0.014
Neg logit contributions
op-0.021
OO-0.019
 cob-0.019
age-0.019
itten-0.019
Token Flor
Token ID4432
Feature activation-8.88e-02

Pos logit contributions
OO+0.002
itten+0.002
 Ow+0.002
eware+0.002
 fault+0.002
Neg logit contributions
 next-0.001
 timer-0.001
 Claus-0.001
Santa-0.001
 Avery-0.001
Tokena
Token ID64
Feature activation-9.56e-03

Pos logit contributions
 next+0.145
 timer+0.143
 smiling+0.124
Santa+0.123
 Avery+0.121
Neg logit contributions
op-0.115
OO-0.114
itten-0.102
ending-0.099
eware-0.098
Token was
Token ID373
Feature activation+8.43e-03

Pos logit contributions
 timer+0.018
 Avery+0.017
 next+0.017
 menus+0.016
 Amber+0.016
Neg logit contributions
op-0.009
age-0.008
itten-0.008
 cob-0.008
OO-0.008
Token!
Token ID0
Feature activation+2.91e-02

Pos logit contributions
 timer+0.017
 Avery+0.016
iza+0.015
Cl+0.015
 next+0.015
Neg logit contributions
 tal-0.012
age-0.011
OO-0.011
itten-0.011
 fault-0.011
Token You
Token ID921
Feature activation+6.29e-03

Pos logit contributions
itten+0.044
 fault+0.044
OO+0.044
op+0.043
 Ow+0.043
Neg logit contributions
 next-0.041
 timer-0.038
oe-0.033
 kindly-0.033
 keeper-0.033
Token are
Token ID389
Feature activation+2.73e-04

Pos logit contributions
itten+0.009
 unnecessary+0.009
OO+0.008
op+0.008
 disease+0.008
Neg logit contributions
 next-0.010
 Avery-0.009
 Claus-0.008
 timer-0.008
 Anna-0.008
Token a
Token ID257
Feature activation-1.75e-02

Pos logit contributions
 unnecessary+0.000
 cob+0.000
 risks+0.000
 muscles+0.000
 fault+0.000
Neg logit contributions
 next-0.000
 timer-0.000
oe-0.000
 Avery-0.000
Cl-0.000
Token great
Token ID1049
Feature activation-6.72e-03

Pos logit contributions
 next+0.027
 Avery+0.025
oe+0.025
 Amber+0.024
 timer+0.024
Neg logit contributions
 disease-0.020
 cob-0.020
 unnecessary-0.020
op-0.019
 spike-0.019
Token writer
Token ID6260
Feature activation-9.75e-02

Pos logit contributions
 next+0.009
Cl+0.007
 timer+0.007
Santa+0.007
 Avery+0.007
Neg logit contributions
op-0.011
itten-0.011
 fault-0.010
 disease-0.010
 muscles-0.010
Token!"
Token ID2474
Feature activation-2.90e-03

Pos logit contributions
 Avery+0.131
 Navy+0.127
Cl+0.126
Friend+0.123
iza+0.122
Neg logit contributions
age-0.132
itten-0.131
op-0.129
 infection-0.126
 fault-0.120
Token she
Token ID673
Feature activation-1.38e-02

Pos logit contributions
 Claus+0.004
 timer+0.004
 next+0.004
oe+0.004
May+0.004
Neg logit contributions
 tum-0.004
itten-0.004
apy-0.004
 muscles-0.004
op-0.004
Token says
Token ID1139
Feature activation+4.09e-03

Pos logit contributions
 next+0.020
iza+0.018
oe+0.018
anut+0.018
ory+0.017
Neg logit contributions
 happened-0.017
 cob-0.017
 fault-0.017
 muscles-0.016
 pierced-0.016
Token.
Token ID13
Feature activation+7.12e-03

Pos logit contributions
op+0.007
apy+0.006
ending+0.006
 muscles+0.006
itten+0.006
Neg logit contributions
Anna-0.005
iza-0.005
Santa-0.005
 waiter-0.005
Cl-0.005
Token\n
Token ID198
Feature activation+1.30e-03

Pos logit contributions
op+0.011
itten+0.011
OO+0.010
 muscles+0.010
 Ow+0.010
Neg logit contributions
 next-0.010
 timer-0.009
 Claus-0.008
oe-0.008
Santa-0.008
Token "
Token ID366
Feature activation+9.49e-03

Pos logit contributions
op+0.007
OO+0.007
itten+0.007
apy+0.006
 Everything+0.006
Neg logit contributions
 next-0.014
oe-0.014
guard-0.014
iza-0.014
ory-0.013
TokenYou
Token ID1639
Feature activation+6.40e-03

Pos logit contributions
itten+0.014
age+0.013
ending+0.013
op+0.012
checked+0.012
Neg logit contributions
 next-0.013
 Claus-0.012
 timer-0.012
maid-0.012
 Avery-0.011
Token are
Token ID389
Feature activation-5.53e-03

Pos logit contributions
itten+0.008
OO+0.008
 unnecessary+0.008
op+0.008
 fault+0.007
Neg logit contributions
 next-0.011
 Avery-0.009
 timer-0.009
 Anna-0.009
Santa-0.009
Token a
Token ID257
Feature activation-2.08e-02

Pos logit contributions
 next+0.008
 timer+0.007
oe+0.007
 Claus+0.007
Santa+0.007
Neg logit contributions
 cob-0.007
 unnecessary-0.006
itten-0.006
 hamm-0.006
 muscles-0.006
Token gifted
Token ID22527
Feature activation-3.80e-02

Pos logit contributions
 next+0.033
oe+0.030
 Claus+0.030
 timer+0.030
 Avery+0.030
Neg logit contributions
 cob-0.023
op-0.022
 disease-0.022
 spike-0.021
OO-0.021
Token writer
Token ID6260
Feature activation-9.65e-02

Pos logit contributions
Santa+0.047
 next+0.045
 Claus+0.044
 Amen+0.043
Say+0.043
Neg logit contributions
op-0.058
 fault-0.057
itten-0.056
 cob-0.056
 muscles-0.056
Token,
Token ID11
Feature activation+4.03e-03

Pos logit contributions
Say+0.119
 timer+0.116
 Avery+0.115
 Navy+0.115
Cl+0.114
Neg logit contributions
op-0.135
itten-0.132
age-0.132
 fault-0.127
 infection-0.126
Token Lily
Token ID20037
Feature activation-1.19e-02

Pos logit contributions
 cob+0.007
itten+0.006
 fault+0.006
 spike+0.006
 disease+0.006
Neg logit contributions
 Claus-0.005
 next-0.004
 Avery-0.004
 Amen-0.004
oe-0.004
Token.
Token ID13
Feature activation+9.70e-03

Pos logit contributions
 timer+0.017
 Avery+0.016
iza+0.016
 Navy+0.015
Say+0.015
Neg logit contributions
op-0.015
 fault-0.015
itten-0.015
 cob-0.015
age-0.015
Token You
Token ID921
Feature activation+6.05e-03

Pos logit contributions
 Everything+0.013
eware+0.013
op+0.013
itten+0.012
inter+0.012
Neg logit contributions
 next-0.014
 timer-0.014
 keeper-0.013
 Claus-0.013
oe-0.012
Token write
Token ID3551
Feature activation-3.51e-02

Pos logit contributions
itten+0.008
OO+0.007
 unnecessary+0.007
op+0.007
 disease+0.007
Neg logit contributions
 next-0.010
 Avery-0.009
 timer-0.009
 Anna-0.008
 Claus-0.008
Token.
Token ID13
Feature activation+8.61e-03

Pos logit contributions
 timer+0.052
 Avery+0.052
 Claus+0.050
oe+0.049
iza+0.049
Neg logit contributions
 unnecessary-0.077
 risks-0.077
 endings-0.073
 fault-0.072
 cob-0.071
Token You
Token ID921
Feature activation+1.36e-02

Pos logit contributions
 Everything+0.012
 Ow+0.012
 muscles+0.011
itten+0.011
eware+0.011
Neg logit contributions
 timer-0.012
 next-0.011
oe-0.011
maid-0.010
iza-0.010
Token are
Token ID389
Feature activation+1.28e-02

Pos logit contributions
itten+0.016
 unnecessary+0.016
OO+0.016
op+0.015
 disease+0.015
Neg logit contributions
 next-0.024
 Avery-0.021
 timer-0.020
 Anna-0.020
 Claus-0.020
Token a
Token ID257
Feature activation-7.86e-03

Pos logit contributions
 cob+0.014
 unnecessary+0.013
 muscles+0.013
 belong+0.013
 risks+0.012
Neg logit contributions
 timer-0.020
ory-0.020
oe-0.019
bell-0.019
ara-0.018
Token wonderful
Token ID7932
Feature activation-4.70e-03

Pos logit contributions
 Avery+0.013
 next+0.013
oe+0.013
 timer+0.012
 Amber+0.012
Neg logit contributions
 cob-0.008
 unnecessary-0.008
 disease-0.007
 risks-0.007
 job-0.007
Token writer
Token ID6260
Feature activation-9.62e-02

Pos logit contributions
 Avery+0.007
Mr+0.006
Say+0.006
 next+0.006
Cl+0.006
Neg logit contributions
 disease-0.006
 job-0.006
 cob-0.006
 fault-0.006
itten-0.006
Token.
Token ID13
Feature activation+6.49e-03

Pos logit contributions
 Avery+0.114
Say+0.109
oe+0.107
 timer+0.107
 pans+0.104
Neg logit contributions
age-0.135
op-0.132
 infection-0.129
itten-0.126
 fault-0.125
Token You
Token ID921
Feature activation+1.23e-02

Pos logit contributions
 Everything+0.009
itten+0.009
 Ow+0.008
eware+0.008
age+0.008
Neg logit contributions
 timer-0.009
 next-0.009
oe-0.008
maid-0.008
ory-0.008
Token are
Token ID389
Feature activation+1.28e-02

Pos logit contributions
itten+0.013
 unnecessary+0.013
OO+0.013
op+0.013
 disease+0.013
Neg logit contributions
 next-0.023
 Avery-0.020
 timer-0.019
 Anna-0.019
 Claus-0.019
Token a
Token ID257
Feature activation-7.01e-03

Pos logit contributions
 cob+0.014
 unnecessary+0.013
 muscles+0.012
 belong+0.012
 job+0.012
Neg logit contributions
 timer-0.020
ory-0.020
oe-0.019
bell-0.019
ider-0.018
Token wonderful
Token ID7932
Feature activation-3.59e-03

Pos logit contributions
 Avery+0.011
 next+0.011
oe+0.011
 timer+0.011
ider+0.011
Neg logit contributions
 cob-0.007
 disease-0.007
 unnecessary-0.007
 job-0.006
 fault-0.006
Token you
Token ID345
Feature activation+2.38e-03

Pos logit contributions
itten+0.045
 unnecessary+0.045
 disease+0.045
 fault+0.044
OO+0.044
Neg logit contributions
 next-0.039
 Avery-0.034
 timer-0.034
 Claus-0.033
oe-0.033
Token're
Token ID821
Feature activation+2.26e-02

Pos logit contributions
itten+0.004
 unnecessary+0.004
 disease+0.003
OO+0.003
 fault+0.003
Neg logit contributions
 next-0.004
 Avery-0.003
 Claus-0.003
 Anna-0.003
 Navy-0.003
Token such
Token ID884
Feature activation+1.07e-02

Pos logit contributions
 fault+0.029
 cob+0.029
 unnecessary+0.029
OO+0.027
 disease+0.027
Neg logit contributions
 next-0.032
 timer-0.031
Anna-0.029
Cl-0.029
oe-0.029
Token a
Token ID257
Feature activation-9.87e-03

Pos logit contributions
itten+0.020
op+0.019
OO+0.019
 unnecessary+0.018
 disease+0.018
Neg logit contributions
 next-0.013
 Avery-0.010
 Anna-0.010
 Claire-0.009
 kindly-0.009
Token good
Token ID922
Feature activation-9.70e-04

Pos logit contributions
 Avery+0.016
 next+0.016
 Amber+0.015
oe+0.015
iza+0.015
Neg logit contributions
 unnecessary-0.011
 disease-0.010
 cob-0.010
 spike-0.010
OO-0.010
Token writer
Token ID6260
Feature activation-9.61e-02

Pos logit contributions
 next+0.002
 Avery+0.001
 timer+0.001
Santa+0.001
Cl+0.001
Neg logit contributions
 fault-0.001
 disease-0.001
itten-0.001
OO-0.001
age-0.001
Token!
Token ID0
Feature activation+1.87e-02

Pos logit contributions
 Avery+0.110
 Navy+0.101
Say+0.101
Cl+0.100
 next+0.099
Neg logit contributions
age-0.151
op-0.147
itten-0.145
 fault-0.143
 infection-0.143
Token I
Token ID314
Feature activation-5.35e-03

Pos logit contributions
itten+0.028
op+0.027
eware+0.026
OO+0.026
ending+0.026
Neg logit contributions
 next-0.026
 timer-0.025
 menus-0.022
 Claus-0.022
oe-0.022
Token wish
Token ID4601
Feature activation-3.52e-02

Pos logit contributions
 next+0.010
 Avery+0.009
 timer+0.009
 Anna+0.009
 Maya+0.008
Neg logit contributions
itten-0.005
 fault-0.005
op-0.005
age-0.005
 disease-0.005
Token I
Token ID314
Feature activation-2.96e-02

Pos logit contributions
 next+0.035
 timer+0.035
 Claus+0.034
Cl+0.033
Santa+0.032
Neg logit contributions
 unnecessary-0.058
 fault-0.058
OO-0.057
itten-0.056
 cob-0.056
Token could
Token ID714
Feature activation-4.05e-03

Pos logit contributions
 next+0.047
 Avery+0.044
 timer+0.044
 Amber+0.041
 menus+0.041
Neg logit contributions
 pierced-0.034
itten-0.034
op-0.033
eware-0.033
OO-0.033
Token came
Token ID1625
Feature activation+2.33e-04

Pos logit contributions
 Amen+0.009
 next+0.009
 Avery+0.009
Cl+0.008
Santa+0.008
Neg logit contributions
op-0.007
itten-0.007
OO-0.006
ilst-0.006
 pierced-0.006
Token home
Token ID1363
Feature activation+1.57e-03

Pos logit contributions
op+0.000
 unnecessary+0.000
itten+0.000
OO+0.000
 fault+0.000
Neg logit contributions
 timer-0.000
Santa-0.000
 Claus-0.000
 waiter-0.000
 Amen-0.000
Token she
Token ID673
Feature activation-2.42e-02

Pos logit contributions
op+0.002
 cob+0.002
 disease+0.002
 injury+0.002
itten+0.002
Neg logit contributions
 Amen-0.002
Santa-0.002
 menus-0.002
 timer-0.002
Say-0.002
Token asked
Token ID1965
Feature activation-2.82e-04

Pos logit contributions
 next+0.042
 Avery+0.041
Cl+0.041
Santa+0.040
 Amber+0.039
Neg logit contributions
 caused-0.025
 happened-0.024
op-0.024
 pierced-0.024
itten-0.023
Token if
Token ID611
Feature activation-2.58e-02

Pos logit contributions
 timer+0.000
 next+0.000
 Claus+0.000
 Amen+0.000
Santa+0.000
Neg logit contributions
OO-0.000
 unnecessary-0.000
 disease-0.000
eware-0.000
 Ow-0.000
Token Flor
Token ID4432
Feature activation-9.58e-02

Pos logit contributions
 timer+0.037
 next+0.035
 Amen+0.033
 Claus+0.032
 menus+0.032
Neg logit contributions
 unnecessary-0.034
eware-0.033
 disease-0.032
 happened-0.032
 risks-0.031
Tokena
Token ID64
Feature activation-2.59e-02

Pos logit contributions
 timer+0.169
 next+0.168
 smiling+0.153
 Amen+0.151
Santa+0.143
Neg logit contributions
op-0.127
OO-0.119
ending-0.113
 unnecessary-0.113
eware-0.109
Token had
Token ID550
Feature activation-6.35e-03

Pos logit contributions
 next+0.042
 timer+0.039
Santa+0.038
 Avery+0.038
 Amen+0.037
Neg logit contributions
op-0.031
itten-0.031
 fault-0.030
 disease-0.030
OO-0.030
Token washed
Token ID18989
Feature activation+3.53e-03

Pos logit contributions
 next+0.010
 Avery+0.009
 Anna+0.008
 Claire+0.008
 timer+0.008
Neg logit contributions
itten-0.009
OO-0.009
op-0.009
 unnecessary-0.008
 disease-0.008
Token her
Token ID607
Feature activation-4.87e-03

Pos logit contributions
itten+0.005
 unnecessary+0.005
 risks+0.005
 disease+0.005
 fault+0.004
Neg logit contributions
 next-0.005
 timer-0.005
 Claus-0.005
 Amen-0.005
 Avery-0.004
Token toys
Token ID14958
Feature activation-1.90e-02

Pos logit contributions
 next+0.008
 Avery+0.007
 timer+0.007
Santa+0.007
 Claus+0.006
Neg logit contributions
itten-0.006
 unnecessary-0.006
 fault-0.006
 disease-0.006
op-0.006
Token was
Token ID373
Feature activation+2.04e-02

Pos logit contributions
 unnecessary+0.066
 risks+0.062
itten+0.060
 disease+0.059
ending+0.059
Neg logit contributions
 next-0.023
 Anna-0.018
 smiled-0.013
Anna-0.013
 smiling-0.012
Token a
Token ID257
Feature activation-1.70e-02

Pos logit contributions
itten+0.033
 unnecessary+0.032
OO+0.032
 disease+0.031
op+0.031
Neg logit contributions
 next-0.027
 Avery-0.023
 timer-0.022
 Claus-0.022
Santa-0.021
Token little
Token ID1310
Feature activation-8.63e-03

Pos logit contributions
 Avery+0.025
 next+0.025
 menus+0.024
minded+0.023
rily+0.023
Neg logit contributions
 cob-0.022
op-0.021
 spike-0.020
itten-0.020
OO-0.020
Token girl
Token ID2576
Feature activation-1.94e-02

Pos logit contributions
 next+0.010
 Avery+0.009
Cl+0.008
 timer+0.008
Santa+0.008
Neg logit contributions
itten-0.016
op-0.015
 cob-0.015
 spike-0.015
 disease-0.014
Token named
Token ID3706
Feature activation-2.95e-02

Pos logit contributions
 Avery+0.026
 timer+0.025
 Claus+0.024
 next+0.024
 menus+0.024
Neg logit contributions
op-0.029
itten-0.029
 cob-0.028
age-0.027
 disease-0.026
Token Beat
Token ID12568
Feature activation-9.58e-02

Pos logit contributions
 next+0.027
 Avery+0.021
 Claus+0.021
 timer+0.021
Santa+0.020
Neg logit contributions
itten-0.058
OO-0.057
 unnecessary-0.056
op-0.056
eware-0.055
Tokenrice
Token ID20970
Feature activation-1.49e-02

Pos logit contributions
 next+0.120
 timer+0.114
 waiter+0.106
 Avery+0.106
 lad+0.103
Neg logit contributions
eware-0.151
OO-0.149
itten-0.148
 unnecessary-0.145
op-0.142
Token.
Token ID13
Feature activation+1.11e-02

Pos logit contributions
 timer+0.017
 Avery+0.016
 next+0.015
 waiter+0.014
 Claus+0.014
Neg logit contributions
 cob-0.026
itten-0.025
op-0.025
age-0.024
OO-0.024
Token She
Token ID1375
Feature activation-5.81e-03

Pos logit contributions
itten+0.021
OO+0.020
 unnecessary+0.020
op+0.020
 disease+0.020
Neg logit contributions
 next-0.013
 Avery-0.010
 Anna-0.010
 Maya-0.009
oe-0.009
Token was
Token ID373
Feature activation+1.23e-02

Pos logit contributions
 next+0.008
 Avery+0.007
 timer+0.007
 Claus+0.007
Santa+0.007
Neg logit contributions
itten-0.009
OO-0.008
op-0.008
 unnecessary-0.008
 disease-0.008
Token so
Token ID523
Feature activation-2.24e-03

Pos logit contributions
itten+0.021
 unnecessary+0.020
OO+0.020
 disease+0.020
op+0.020
Neg logit contributions
 next-0.015
 Avery-0.013
 timer-0.012
 Claus-0.012
Santa-0.012
Token!
Token ID0
Feature activation+1.87e-02

Pos logit contributions
itten+0.014
op+0.014
 cob+0.013
age+0.013
 fault+0.013
Neg logit contributions
 next-0.013
 Avery-0.013
Santa-0.013
Cl-0.012
iza-0.012
Token You
Token ID921
Feature activation+7.47e-03

Pos logit contributions
itten+0.025
op+0.025
eware+0.024
 fault+0.024
ending+0.024
Neg logit contributions
 next-0.029
 timer-0.027
 menus-0.024
 Claus-0.024
oe-0.024
Token're
Token ID821
Feature activation+2.62e-02

Pos logit contributions
itten+0.010
 unnecessary+0.010
OO+0.010
 disease+0.010
op+0.010
Neg logit contributions
 next-0.012
 Avery-0.010
 Claus-0.010
 timer-0.010
 Anna-0.010
Token a
Token ID257
Feature activation-1.73e-02

Pos logit contributions
 cob+0.034
 unnecessary+0.034
 fault+0.033
 disease+0.032
itten+0.032
Neg logit contributions
 next-0.036
 timer-0.034
oe-0.033
Anna-0.032
 Avery-0.032
Token good
Token ID922
Feature activation-5.49e-05

Pos logit contributions
 next+0.026
 Avery+0.025
oe+0.025
 timer+0.025
 Anna+0.024
Neg logit contributions
 cob-0.020
 spike-0.019
 disease-0.019
OO-0.019
 unnecessary-0.019
Token writer
Token ID6260
Feature activation-9.53e-02

Pos logit contributions
 next+0.000
 timer+0.000
Santa+0.000
Cl+0.000
 Avery+0.000
Neg logit contributions
 fault-0.000
itten-0.000
 disease-0.000
OO-0.000
op-0.000
Token."
Token ID526
Feature activation-1.01e-02

Pos logit contributions
 Avery+0.108
Say+0.107
 next+0.104
 Navy+0.102
 timer+0.102
Neg logit contributions
age-0.145
op-0.144
itten-0.142
 fault-0.139
 infection-0.138
Token Lily
Token ID20037
Feature activation-1.50e-02

Pos logit contributions
 next+0.014
 sailor+0.013
 Claus+0.013
 priest+0.012
 timer+0.012
Neg logit contributions
op-0.015
inter-0.014
 Ow-0.014
itten-0.014
OO-0.014
Token felt
Token ID2936
Feature activation-5.30e-03

Pos logit contributions
 next+0.018
 Avery+0.015
Santa+0.015
 timer+0.015
 Amen+0.014
Neg logit contributions
itten-0.024
OO-0.024
op-0.024
 pierced-0.023
 tal-0.023
Token happy
Token ID3772
Feature activation-4.61e-03

Pos logit contributions
 next+0.005
 Avery+0.004
 timer+0.004
 Claus+0.004
oe+0.004
Neg logit contributions
itten-0.010
 unnecessary-0.010
 disease-0.010
OO-0.010
 fault-0.010
Token and
Token ID290
Feature activation-1.63e-02

Pos logit contributions
 next+0.005
 timer+0.005
 Claus+0.005
 Avery+0.005
Cl+0.005
Neg logit contributions
 fault-0.007
 disease-0.007
itten-0.007
 cob-0.007
op-0.007
Token And
Token ID843
Feature activation-4.88e-02

Pos logit contributions
itten+0.007
eware+0.007
 unnecessary+0.007
OO+0.006
inter+0.006
Neg logit contributions
 next-0.006
 timer-0.006
 Claus-0.006
oe-0.005
maid-0.005
Token you
Token ID345
Feature activation+1.89e-04

Pos logit contributions
 timer+0.052
 Claus+0.050
oe+0.048
Santa+0.047
 Avery+0.047
Neg logit contributions
 unnecessary-0.078
 risks-0.075
itten-0.075
 cob-0.075
 weaknesses-0.074
Token are
Token ID389
Feature activation+9.33e-03

Pos logit contributions
itten+0.000
 unnecessary+0.000
OO+0.000
op+0.000
 disease+0.000
Neg logit contributions
 next-0.000
 Avery-0.000
 Claus-0.000
 timer-0.000
 Anna-0.000
Token a
Token ID257
Feature activation-1.39e-02

Pos logit contributions
 cob+0.011
 unnecessary+0.010
 disease+0.009
 muscles+0.009
 risks+0.009
Neg logit contributions
ory-0.015
 timer-0.015
oe-0.014
bell-0.014
mer-0.014
Token good
Token ID922
Feature activation+1.30e-03

Pos logit contributions
 next+0.023
 Avery+0.023
oe+0.021
 Amber+0.021
rily+0.021
Neg logit contributions
 cob-0.015
 disease-0.014
 unnecessary-0.013
op-0.013
 spike-0.013
Token writer
Token ID6260
Feature activation-9.45e-02

Pos logit contributions
op+0.002
itten+0.002
 fault+0.002
 disease+0.002
OO+0.002
Neg logit contributions
 next-0.002
Santa-0.002
Cl-0.002
 Avery-0.002
Mr-0.002
Token.
Token ID13
Feature activation+4.84e-03

Pos logit contributions
 Avery+0.116
ory+0.111
 Navy+0.111
Say+0.105
oe+0.103
Neg logit contributions
age-0.135
op-0.132
 infection-0.130
 fault-0.127
itten-0.124
Token But
Token ID887
Feature activation+1.19e-04

Pos logit contributions
itten+0.007
eware+0.006
 Everything+0.006
 unnecessary+0.006
inter+0.006
Neg logit contributions
 next-0.007
 timer-0.007
oe-0.006
 Claus-0.006
maid-0.006
Token next
Token ID1306
Feature activation-3.51e-02

Pos logit contributions
itten+0.000
 unnecessary+0.000
OO+0.000
 disease+0.000
 fault+0.000
Neg logit contributions
 next-0.000
 Avery-0.000
 Claus-0.000
 timer-0.000
Santa-0.000
Token time
Token ID640
Feature activation-2.08e-03

Pos logit contributions
 next+0.064
 Avery+0.060
 Claus+0.059
 timer+0.056
Santa+0.056
Neg logit contributions
itten-0.037
OO-0.036
 disease-0.035
 unnecessary-0.034
 fault-0.034
Token,
Token ID11
Feature activation+2.12e-02

Pos logit contributions
 timer+0.004
Santa+0.003
Cl+0.003
 Avery+0.003
iza+0.003
Neg logit contributions
op-0.002
eware-0.002
OO-0.002
itten-0.002
 fault-0.002
Token,
Token ID11
Feature activation+2.58e-02

Pos logit contributions
itten+0.014
 unnecessary+0.013
OO+0.013
 disease+0.013
 fault+0.013
Neg logit contributions
 next-0.011
 Avery-0.009
 timer-0.008
 Anna-0.008
 Claus-0.008
Token I
Token ID314
Feature activation-3.69e-03

Pos logit contributions
itten+0.049
 unnecessary+0.048
 fault+0.047
 cob+0.047
 disease+0.047
Neg logit contributions
 next-0.024
 timer-0.021
 Avery-0.019
 Claus-0.019
oe-0.019
Token am
Token ID716
Feature activation-2.96e-02

Pos logit contributions
 next+0.006
 Avery+0.005
 Anna+0.005
 Claus+0.005
 timer+0.005
Neg logit contributions
itten-0.005
 unnecessary-0.005
OO-0.005
 disease-0.005
 fault-0.005
Token a
Token ID257
Feature activation-2.93e-02

Pos logit contributions
 next+0.045
 timer+0.045
Anna+0.041
guard+0.041
Say+0.041
Neg logit contributions
 cob-0.037
 unnecessary-0.035
 muscles-0.034
 fault-0.033
 risks-0.032
Token brave
Token ID14802
Feature activation-2.00e-02

Pos logit contributions
 next+0.038
 menus+0.037
 Claus+0.036
oe+0.035
Say+0.035
Neg logit contributions
 cob-0.044
 spike-0.042
 disease-0.038
 bal-0.038
 web-0.037
Token astronaut
Token ID33779
Feature activation-9.35e-02

Pos logit contributions
 next+0.026
Say+0.025
Mr+0.025
Friend+0.025
anut+0.024
Neg logit contributions
 cob-0.029
itten-0.028
 disease-0.028
OO-0.027
 spike-0.027
Token!"
Token ID2474
Feature activation+5.93e-03

Pos logit contributions
 Avery+0.139
Say+0.139
Friend+0.135
Cl+0.133
 Navy+0.131
Neg logit contributions
itten-0.117
age-0.116
 fault-0.112
 tal-0.110
op-0.110
Token Ben
Token ID3932
Feature activation-1.62e-02

Pos logit contributions
itten+0.010
op+0.009
 unnecessary+0.009
 muscles+0.009
 disease+0.009
Neg logit contributions
 next-0.007
 timer-0.006
 Claus-0.006
 Amen-0.006
Santa-0.006
Token said
Token ID531
Feature activation+7.30e-03

Pos logit contributions
 next+0.021
 Avery+0.018
 Amen+0.018
Santa+0.017
 timer+0.017
Neg logit contributions
itten-0.025
op-0.025
 pierced-0.024
 tal-0.023
eware-0.023
Token.
Token ID13
Feature activation+5.74e-03

Pos logit contributions
op+0.011
apy+0.009
inter+0.009
 tum+0.009
 fault+0.009
Neg logit contributions
Santa-0.010
May-0.009
Anna-0.009
Mr-0.009
Say-0.009
Token He
Token ID679
Feature activation-1.26e-02

Pos logit contributions
op+0.006
apy+0.005
 muscles+0.005
 pierced+0.005
 tum+0.005
Neg logit contributions
iza-0.010
anut-0.010
 Claus-0.010
 timer-0.010
 Avery-0.010
Token Mum
Token ID38367
Feature activation+2.04e-02

Pos logit contributions
OO+0.010
itten+0.010
 fault+0.009
op+0.009
 disease+0.009
Neg logit contributions
 next-0.008
 timer-0.007
 Claus-0.007
 Avery-0.007
Santa-0.007
Token was
Token ID373
Feature activation+1.38e-02

Pos logit contributions
op+0.030
itten+0.030
 cob+0.027
 pierced+0.027
ilst+0.027
Neg logit contributions
 next-0.027
Cl-0.025
 Avery-0.025
 Amen-0.024
Santa-0.024
Token very
Token ID845
Feature activation-1.04e-02

Pos logit contributions
itten+0.023
 disease+0.022
 unnecessary+0.022
 fault+0.022
op+0.022
Neg logit contributions
 next-0.016
 Avery-0.014
 Claus-0.014
 timer-0.013
Santa-0.013
Token pleased
Token ID10607
Feature activation-1.10e-02

Pos logit contributions
 next+0.012
 Avery+0.012
 Claus+0.012
 menus+0.011
 Year+0.011
Neg logit contributions
op-0.016
apy-0.015
itten-0.015
 disease-0.015
ending-0.015
Token when
Token ID618
Feature activation-3.74e-02

Pos logit contributions
 timer+0.015
 Claus+0.014
Santa+0.014
 Avery+0.014
 Amen+0.014
Neg logit contributions
op-0.015
itten-0.015
 fault-0.014
ending-0.014
 disease-0.014
Token Flor
Token ID4432
Feature activation-9.28e-02

Pos logit contributions
 timer+0.070
 Claus+0.063
 Amen+0.062
 menus+0.061
rant+0.058
Neg logit contributions
 cob-0.038
 unnecessary-0.035
apy-0.033
 happened-0.033
 mattered-0.032
Tokena
Token ID64
Feature activation-2.63e-02

Pos logit contributions
 next+0.170
 timer+0.170
 Amen+0.167
Santa+0.154
 smiling+0.152
Neg logit contributions
op-0.111
OO-0.105
ending-0.094
eware-0.092
 unnecessary-0.091
Token showed
Token ID3751
Feature activation-1.44e-02

Pos logit contributions
 Amen+0.040
 next+0.039
Santa+0.037
 Avery+0.036
 timer+0.035
Neg logit contributions
op-0.034
itten-0.031
ops-0.031
 happened-0.031
 pierced-0.031
Token her
Token ID607
Feature activation+1.65e-03

Pos logit contributions
 next+0.020
 timer+0.019
 Claus+0.018
Santa+0.017
 Amen+0.017
Neg logit contributions
 unnecessary-0.021
 fault-0.020
itten-0.020
 risks-0.020
 disease-0.020
Token the
Token ID262
Feature activation-7.47e-04

Pos logit contributions
itten+0.002
 unnecessary+0.002
OO+0.002
 disease+0.002
op+0.002
Neg logit contributions
 next-0.002
 Avery-0.002
 timer-0.002
 Claus-0.002
 Anna-0.002
Token clean
Token ID3424
Feature activation-1.62e-02

Pos logit contributions
 next+0.001
 Avery+0.001
 timer+0.001
 Claus+0.001
 Anna+0.001
Neg logit contributions
itten-0.001
 unnecessary-0.001
OO-0.001
op-0.001
 disease-0.001
Token,
Token ID11
Feature activation+2.05e-02

Pos logit contributions
 next+0.009
 Avery+0.008
Santa+0.008
Cl+0.008
 timer+0.008
Neg logit contributions
op-0.007
itten-0.007
OO-0.006
 fault-0.006
 disease-0.006
Token can
Token ID460
Feature activation-1.51e-02

Pos logit contributions
itten+0.035
 unnecessary+0.034
OO+0.034
op+0.034
 fault+0.034
Neg logit contributions
 next-0.024
 Avery-0.020
 timer-0.019
 Claus-0.019
oe-0.018
Token I
Token ID314
Feature activation-1.82e-02

Pos logit contributions
 next+0.014
 Avery+0.012
 timer+0.012
 menus+0.012
 waiter+0.012
Neg logit contributions
itten-0.028
op-0.028
 spike-0.027
uch-0.026
apy-0.026
Token be
Token ID307
Feature activation-8.01e-03

Pos logit contributions
 next+0.030
 Avery+0.030
 Amber+0.028
Santa+0.028
 Maya+0.028
Neg logit contributions
op-0.021
itten-0.021
're-0.020
age-0.020
inter-0.019
Token an
Token ID281
Feature activation+8.47e-03

Pos logit contributions
 Claus+0.012
 next+0.011
 Avery+0.011
 keeper+0.010
guard+0.010
Neg logit contributions
 unnecessary-0.011
 muscles-0.010
 disease-0.010
 risks-0.010
 fault-0.010
Token astronaut
Token ID33779
Feature activation-9.28e-02

Pos logit contributions
itten+0.011
OO+0.010
 unnecessary+0.010
 disease+0.010
 fault+0.010
Neg logit contributions
 next-0.014
 Avery-0.012
 Claus-0.012
 timer-0.012
oe-0.012
Token and
Token ID290
Feature activation-5.81e-03

Pos logit contributions
Santa+0.114
Cl+0.113
Friend+0.112
 Navy+0.112
 Avery+0.112
Neg logit contributions
age-0.130
itten-0.129
op-0.127
 infection-0.126
 fault-0.123
Token fly
Token ID6129
Feature activation-1.89e-02

Pos logit contributions
 next+0.007
oe+0.007
 timer+0.006
Cl+0.006
ory+0.006
Neg logit contributions
 unnecessary-0.008
 cob-0.008
itten-0.008
 pierced-0.008
 disease-0.008
Token to
Token ID284
Feature activation+1.53e-02

Pos logit contributions
 timer+0.027
 Claus+0.026
 waiter+0.026
 next+0.025
 Avery+0.025
Neg logit contributions
 risks-0.024
 unnecessary-0.024
 webs-0.023
 cob-0.023
OO-0.023
Token space
Token ID2272
Feature activation-4.71e-02

Pos logit contributions
 unnecessary+0.024
OO+0.022
 cob+0.022
 risks+0.022
 disease+0.022
Neg logit contributions
 next-0.020
 Claus-0.017
 timer-0.017
oe-0.017
 Avery-0.017
Token someday
Token ID25580
Feature activation-6.41e-02

Pos logit contributions
 timer+0.057
Friend+0.057
ory+0.056
 Claus+0.056
Mr+0.052
Neg logit contributions
 cob-0.066
 fault-0.063
 risks-0.062
 webs-0.061
op-0.060
Token time
Token ID640
Feature activation-1.55e-02

Pos logit contributions
itten+0.011
OO+0.010
 unnecessary+0.010
 disease+0.010
op+0.010
Neg logit contributions
 next-0.015
 Avery-0.013
 timer-0.013
 Claus-0.013
 Anna-0.012
Token of
Token ID286
Feature activation-1.59e-02

Pos logit contributions
 next+0.020
 Avery+0.017
 timer+0.017
 Claus+0.016
Santa+0.015
Neg logit contributions
itten-0.025
 unnecessary-0.024
OO-0.024
op-0.024
 disease-0.024
Token becoming
Token ID5033
Feature activation-2.15e-02

Pos logit contributions
 next+0.023
 Avery+0.020
 Claus+0.019
 timer+0.019
Santa+0.019
Neg logit contributions
itten-0.023
 unnecessary-0.023
 disease-0.022
 fault-0.022
OO-0.022
Token a
Token ID257
Feature activation-1.60e-02

Pos logit contributions
 next+0.032
 Avery+0.030
 timer+0.030
 Claus+0.029
 Maya+0.028
Neg logit contributions
 unnecessary-0.030
 fault-0.028
 disease-0.028
 risks-0.028
itten-0.027
Token brave
Token ID14802
Feature activation-1.43e-02

Pos logit contributions
 next+0.024
 Avery+0.022
 Claus+0.021
 Anna+0.021
 Maya+0.020
Neg logit contributions
 disease-0.022
 unnecessary-0.021
itten-0.021
 fault-0.021
OO-0.020
Token astronaut
Token ID33779
Feature activation-9.28e-02

Pos logit contributions
 next+0.020
 Avery+0.019
Santa+0.018
Cl+0.018
 Claus+0.018
Neg logit contributions
 disease-0.021
itten-0.020
 risks-0.020
 fault-0.019
 muscles-0.019
Token,
Token ID11
Feature activation+2.08e-02

Pos logit contributions
Anna+0.110
Cl+0.109
 Avery+0.104
oe+0.102
 timer+0.101
Neg logit contributions
age-0.144
 fault-0.139
itten-0.138
op-0.137
 injury-0.136
Token flying
Token ID7348
Feature activation-1.36e-02

Pos logit contributions
 fault+0.027
 muscles+0.027
 unnecessary+0.027
 risks+0.026
 disease+0.025
Neg logit contributions
 timer-0.030
oe-0.029
 Amen-0.028
 next-0.028
Cl-0.027
Token through
Token ID832
Feature activation+1.41e-02

Pos logit contributions
 next+0.019
 timer+0.018
 Avery+0.018
 Claus+0.017
 waiter+0.017
Neg logit contributions
 unnecessary-0.019
 risks-0.019
 tal-0.019
 fault-0.018
OO-0.018
Token the
Token ID262
Feature activation+1.58e-02

Pos logit contributions
 unnecessary+0.022
 cob+0.022
 risks+0.021
 tal+0.021
 muscles+0.021
Neg logit contributions
 next-0.017
 timer-0.017
 Claus-0.016
 waiter-0.015
oe-0.015
Token stars
Token ID5788
Feature activation-2.25e-02

Pos logit contributions
 unnecessary+0.020
 risks+0.020
 cob+0.019
 tal+0.019
 disease+0.018
Neg logit contributions
 next-0.025
 timer-0.024
 Claus-0.024
 Avery-0.023
 Anna-0.023
Token Rex
Token ID17853
Feature activation-1.46e-02

Pos logit contributions
 next+0.025
 Avery+0.020
 Anna+0.020
 timer+0.018
 Maya+0.018
Neg logit contributions
itten-0.035
OO-0.035
 unnecessary-0.035
 disease-0.033
 fault-0.033
Token would
Token ID561
Feature activation-2.90e-02

Pos logit contributions
 next+0.023
 Avery+0.021
 timer+0.020
Santa+0.019
 Claus+0.019
Neg logit contributions
itten-0.020
 unnecessary-0.019
OO-0.019
op-0.018
 disease-0.018
Token be
Token ID307
Feature activation-1.91e-02

Pos logit contributions
 next+0.044
 Avery+0.040
 timer+0.038
oe+0.037
 Claus+0.036
Neg logit contributions
 disease-0.039
 fault-0.038
 unnecessary-0.038
itten-0.038
 risks-0.036
Token a
Token ID257
Feature activation-3.55e-02

Pos logit contributions
 Avery+0.029
 next+0.027
 timer+0.027
oe+0.027
ory+0.026
Neg logit contributions
 unnecessary-0.027
 risks-0.025
 fault-0.024
 disease-0.023
 muscles-0.023
Token good
Token ID922
Feature activation-1.82e-02

Pos logit contributions
 Avery+0.057
 next+0.057
 Amber+0.052
 Claus+0.051
oe+0.051
Neg logit contributions
 disease-0.044
 unnecessary-0.044
 risks-0.040
OO-0.040
 fault-0.040
Token lear
Token ID22454
Feature activation-9.26e-02

Pos logit contributions
 next+0.027
 Avery+0.025
 timer+0.024
Santa+0.024
 Amen+0.023
Neg logit contributions
 disease-0.025
 fault-0.025
itten-0.024
OO-0.024
op-0.024
Tokenner
Token ID1008
Feature activation-6.44e-02

Pos logit contributions
 next+0.140
 Avery+0.130
 menus+0.123
 timer+0.122
 waiter+0.121
Neg logit contributions
op-0.126
itten-0.124
ending-0.117
 unnecessary-0.113
OO-0.113
Token and
Token ID290
Feature activation+7.91e-05

Pos logit contributions
 Avery+0.066
 timer+0.061
 next+0.060
 Amen+0.059
Santa+0.059
Neg logit contributions
 fault-0.118
op-0.111
itten-0.110
 disease-0.109
age-0.108
Token look
Token ID804
Feature activation-5.13e-03

Pos logit contributions
 unnecessary+0.000
 disease+0.000
 weaknesses+0.000
 fault+0.000
 risks+0.000
Neg logit contributions
 Avery-0.000
oe-0.000
maid-0.000
Santa-0.000
 next-0.000
Tokener
Token ID263
Feature activation-9.69e-03

Pos logit contributions
 timer+0.009
 Claus+0.009
 Avery+0.009
 next+0.009
 Amen+0.009
Neg logit contributions
 unnecessary-0.006
 risks-0.005
 muscles-0.004
 cob-0.004
 disease-0.004
Token too
Token ID1165
Feature activation+1.73e-03

Pos logit contributions
 next+0.011
 timer+0.011
 Avery+0.011
 Claire+0.010
Santa+0.010
Neg logit contributions
op-0.016
itten-0.016
 fault-0.015
eware-0.015
age-0.015
Token.
Token ID13
Feature activation+5.65e-04

Pos logit contributions
 next+0.032
 waiter+0.030
 Avery+0.029
Santa+0.029
 timer+0.028
Neg logit contributions
itten-0.032
op-0.031
OO-0.029
age-0.029
 tal-0.029
Token It
Token ID632
Feature activation+2.12e-02

Pos logit contributions
OO+0.001
itten+0.001
op+0.001
eware+0.001
 Everything+0.001
Neg logit contributions
 next-0.001
 timer-0.001
 Claus-0.001
maid-0.001
iza-0.001
Token then
Token ID788
Feature activation-2.74e-02

Pos logit contributions
itten+0.031
 unnecessary+0.030
OO+0.030
op+0.029
 disease+0.029
Neg logit contributions
 next-0.031
 Avery-0.026
 timer-0.025
 Claus-0.025
 Anna-0.025
Token lowered
Token ID17788
Feature activation-9.74e-03

Pos logit contributions
 next+0.040
 Avery+0.035
 timer+0.034
 Claus+0.033
Santa+0.032
Neg logit contributions
itten-0.039
OO-0.038
 unnecessary-0.037
op-0.037
 pierced-0.037
Token an
Token ID281
Feature activation+4.53e-03

Pos logit contributions
 Claus+0.016
 timer+0.016
 waiter+0.016
iza+0.015
ory+0.015
Neg logit contributions
 tal-0.010
itten-0.010
 pierced-0.010
 cob-0.010
 disease-0.010
Token astronaut
Token ID33779
Feature activation-9.19e-02

Pos logit contributions
itten+0.008
 unnecessary+0.008
OO+0.007
op+0.007
 fault+0.007
Neg logit contributions
 next-0.006
 Avery-0.005
 timer-0.004
 Claus-0.004
 Anna-0.004
Token who
Token ID508
Feature activation-3.07e-02

Pos logit contributions
 Avery+0.117
 next+0.116
Anna+0.116
Cl+0.115
 waiter+0.114
Neg logit contributions
itten-0.135
age-0.131
op-0.129
OO-0.129
 pierced-0.125
Token waved
Token ID26834
Feature activation-3.99e-02

Pos logit contributions
 next+0.047
 timer+0.041
 Avery+0.041
 Anna+0.039
 waiter+0.039
Neg logit contributions
itten-0.041
op-0.040
 pierced-0.040
OO-0.039
 tal-0.038
Token to
Token ID284
Feature activation+1.23e-02

Pos logit contributions
 next+0.060
 timer+0.058
 Claus+0.056
 Avery+0.053
Anna+0.051
Neg logit contributions
 unnecessary-0.052
itten-0.051
OO-0.051
 pierced-0.050
 risks-0.050
Token Sarah
Token ID10490
Feature activation-1.29e-02

Pos logit contributions
 unnecessary+0.015
itten+0.015
OO+0.015
 risks+0.015
 disease+0.014
Neg logit contributions
 next-0.019
 Claus-0.017
 timer-0.017
 waiter-0.017
 Avery-0.017
Token!
Token ID0
Feature activation+2.07e-02

Pos logit contributions
 waiter+0.015
 next+0.014
 timer+0.014
iza+0.014
Anna+0.014
Neg logit contributions
itten-0.021
op-0.021
 tal-0.019
 cob-0.019
OO-0.019
Token The
Token ID383
Feature activation-2.14e-02

Pos logit contributions
 next+0.005
 timer+0.005
 Avery+0.005
Santa+0.004
 Claus+0.004
Neg logit contributions
OO-0.005
 fault-0.005
op-0.005
eware-0.005
inter-0.005
Token bears
Token ID13062
Feature activation+5.10e-04

Pos logit contributions
 Avery+0.037
iza+0.034
 Maya+0.033
oe+0.033
 next+0.033
Neg logit contributions
 cob-0.024
 unnecessary-0.024
 risks-0.023
 sap-0.022
 muscles-0.022
Token used
Token ID973
Feature activation-1.23e-02

Pos logit contributions
 unnecessary+0.001
itten+0.001
 fault+0.001
OO+0.001
 disease+0.001
Neg logit contributions
 next-0.001
 kindly-0.001
 smiled-0.001
 thanked-0.001
 smiling-0.001
Token their
Token ID511
Feature activation+2.21e-02

Pos logit contributions
 next+0.012
 Anna+0.008
 Avery+0.008
 smiling+0.008
 kindly+0.008
Neg logit contributions
OO-0.026
itten-0.026
op-0.025
uch-0.025
 unnecessary-0.025
Token strong
Token ID1913
Feature activation+3.70e-02

Pos logit contributions
itten+0.028
OO+0.028
op+0.027
eware+0.026
ending+0.025
Neg logit contributions
 next-0.039
 timer-0.031
 smiling-0.031
 kindly-0.030
 smiles-0.030
Token arms
Token ID5101
Feature activation+6.64e-02

Pos logit contributions
itten+0.046
op+0.045
OO+0.044
uch+0.044
liv+0.043
Neg logit contributions
 next-0.065
 smiling-0.053
 kindly-0.051
 timer-0.050
 smiles-0.050
Token to
Token ID284
Feature activation+2.00e-02

Pos logit contributions
itten+0.134
 unnecessary+0.134
OO+0.133
 disease+0.130
 risks+0.126
Neg logit contributions
 next-0.065
 Avery-0.046
 kindly-0.044
 smiling-0.042
 Claus-0.042
Token clear
Token ID1598
Feature activation+2.22e-02

Pos logit contributions
itten+0.030
OO+0.029
 unnecessary+0.029
 fault+0.028
op+0.028
Neg logit contributions
 next-0.030
 Avery-0.025
 timer-0.024
 Anna-0.024
 Claus-0.024
Token away
Token ID1497
Feature activation+1.74e-03

Pos logit contributions
itten+0.040
OO+0.039
 unnecessary+0.038
op+0.038
 disease+0.037
Neg logit contributions
 next-0.027
 Avery-0.021
 Anna-0.020
 timer-0.020
 smiling-0.019
Token the
Token ID262
Feature activation-1.82e-04

Pos logit contributions
itten+0.003
 unnecessary+0.003
OO+0.003
 fault+0.003
 disease+0.003
Neg logit contributions
 next-0.002
 Avery-0.002
 timer-0.002
 Claus-0.002
 Amen-0.002
Token broken
Token ID5445
Feature activation-3.29e-03

Pos logit contributions
 next+0.000
 Avery+0.000
iza+0.000
 Claus+0.000
oe+0.000
Neg logit contributions
 unnecessary-0.000
itten-0.000
 risks-0.000
 cob-0.000
 disease-0.000
Token.
Token ID13
Feature activation-8.46e-03

Pos logit contributions
op+0.015
itten+0.015
age+0.015
 spike+0.014
 disease+0.014
Neg logit contributions
 Avery-0.012
Santa-0.011
Anna-0.011
Cl-0.011
 waiter-0.011
Token It
Token ID632
Feature activation+9.97e-03

Pos logit contributions
 next+0.011
 Avery+0.009
 timer+0.009
Santa+0.008
iza+0.008
Neg logit contributions
itten-0.014
OO-0.013
op-0.013
 disease-0.013
 muscles-0.013
Token was
Token ID373
Feature activation+1.73e-03

Pos logit contributions
itten+0.015
OO+0.014
op+0.014
 pierced+0.013
 disease+0.013
Neg logit contributions
 next-0.014
 Avery-0.013
 timer-0.012
 Claus-0.012
Santa-0.012
Token small
Token ID1402
Feature activation+3.04e-02

Pos logit contributions
itten+0.003
 disease+0.003
 unnecessary+0.003
OO+0.003
 fault+0.003
Neg logit contributions
 next-0.002
 Avery-0.002
 timer-0.002
 Claus-0.002
Santa-0.002
Token and
Token ID290
Feature activation-2.07e-02

Pos logit contributions
itten+0.065
op+0.062
 unnecessary+0.062
OO+0.062
 disease+0.061
Neg logit contributions
 next-0.024
 Avery-0.020
 timer-0.018
Santa-0.017
 Claus-0.017
Token black
Token ID2042
Feature activation+7.00e-02

Pos logit contributions
 next+0.027
 Avery+0.026
 timer+0.024
 Claus+0.023
May+0.023
Neg logit contributions
itten-0.031
 cob-0.030
 disease-0.030
 unnecessary-0.029
 pierced-0.029
Token and
Token ID290
Feature activation-1.77e-02

Pos logit contributions
 unnecessary+0.168
itten+0.167
OO+0.164
 fault+0.162
 disease+0.162
Neg logit contributions
 next-0.043
 Anna-0.021
 Avery-0.021
 timer-0.019
 smiling-0.018
Token had
Token ID550
Feature activation+1.58e-02

Pos logit contributions
 next+0.022
 Avery+0.020
 timer+0.019
 Claus+0.018
Santa+0.018
Neg logit contributions
itten-0.028
 pierced-0.027
 disease-0.026
 cob-0.026
 unnecessary-0.026
Token a
Token ID257
Feature activation+1.34e-02

Pos logit contributions
itten+0.026
OO+0.025
op+0.025
 unnecessary+0.025
 disease+0.024
Neg logit contributions
 next-0.021
 Anna-0.017
 Avery-0.017
 timer-0.016
Santa-0.016
Token loud
Token ID7812
Feature activation+3.02e-02

Pos logit contributions
 unnecessary+0.022
itten+0.022
ilst+0.021
OO+0.021
 fault+0.021
Neg logit contributions
 next-0.017
 timer-0.014
 Avery-0.014
 smiling-0.014
 Claus-0.014
Token voice
Token ID3809
Feature activation+6.18e-03

Pos logit contributions
 unnecessary+0.049
 fault+0.048
itten+0.048
ending+0.047
ilst+0.047
Neg logit contributions
 next-0.039
 smiling-0.036
 timer-0.033
 Anna-0.032
 kindly-0.031
Token is
Token ID318
Feature activation+2.33e-02

Pos logit contributions
itten+0.001
OO+0.001
 fault+0.001
op+0.001
 disease+0.001
Neg logit contributions
 next-0.001
 Avery-0.001
 timer-0.001
 Claus-0.001
Santa-0.001
Token:
Token ID25
Feature activation+5.19e-02

Pos logit contributions
itten+0.028
 unnecessary+0.028
op+0.028
 fault+0.028
 disease+0.027
Neg logit contributions
 next-0.038
 timer-0.036
 Avery-0.035
 Claus-0.034
Santa-0.033
Token Don
Token ID2094
Feature activation-4.58e-04

Pos logit contributions
 fault+0.072
op+0.072
itten+0.072
 unnecessary+0.071
ending+0.071
Neg logit contributions
 timer-0.069
 next-0.068
 Claus-0.067
 Avery-0.065
oe-0.064
Tokenâ
Token ID22940
Feature activation-8.29e-03

Pos logit contributions
 next+0.001
 Avery+0.001
 Anna+0.001
 timer+0.001
 Claus+0.001
Neg logit contributions
itten-0.001
 unnecessary-0.001
OO-0.001
 fault-0.001
 disease-0.001
Token
Token ID26391
Feature activation+8.35e-03

Pos logit contributions
 next+0.015
 Avery+0.013
 timer+0.013
 Claus+0.013
 Anna+0.012
Neg logit contributions
itten-0.010
 unnecessary-0.009
OO-0.009
op-0.009
 disease-0.009
Token
Token ID8151
Feature activation+7.01e-02

Pos logit contributions
itten+0.006
OO+0.005
 unnecessary+0.005
op+0.005
 fault+0.005
Neg logit contributions
 next-0.019
 Avery-0.017
 timer-0.017
 Claus-0.016
 Anna-0.016
Tokent
Token ID83
Feature activation+2.35e-02

Pos logit contributions
itten+0.062
op+0.059
OO+0.056
 fault+0.052
 unnecessary+0.051
Neg logit contributions
 next-0.147
 Avery-0.135
 timer-0.133
 Claus-0.129
Santa-0.128
Token stir
Token ID11240
Feature activation+4.42e-03

Pos logit contributions
itten+0.034
 unnecessary+0.033
OO+0.033
op+0.032
 disease+0.032
Neg logit contributions
 next-0.035
 Avery-0.030
 timer-0.029
 Claus-0.028
 Anna-0.028
Token things
Token ID1243
Feature activation+3.04e-03

Pos logit contributions
itten+0.006
 unnecessary+0.006
OO+0.006
op+0.006
 disease+0.006
Neg logit contributions
 next-0.007
 Avery-0.006
 timer-0.006
 Claus-0.006
Santa-0.005
Token up
Token ID510
Feature activation-1.43e-02

Pos logit contributions
 unnecessary+0.005
itten+0.005
 fault+0.005
op+0.004
OO+0.004
Neg logit contributions
 next-0.004
 timer-0.004
 Avery-0.004
 Amen-0.003
 Claus-0.003
Token if
Token ID611
Feature activation-2.05e-02

Pos logit contributions
 timer+0.019
 Claus+0.019
 Amen+0.018
 waiter+0.017
 Avery+0.017
Neg logit contributions
 unnecessary-0.022
 risks-0.022
 fault-0.021
OO-0.020
 tal-0.020
Token church
Token ID4928
Feature activation-3.04e-02

Pos logit contributions
 next+0.035
 Avery+0.030
 timer+0.028
 Claus+0.028
 Anna+0.028
Neg logit contributions
itten-0.045
 unnecessary-0.043
OO-0.043
op-0.042
 disease-0.042
Token was
Token ID373
Feature activation+4.64e-03

Pos logit contributions
 next+0.046
 Avery+0.044
 Amber+0.041
Santa+0.041
Cl+0.040
Neg logit contributions
itten-0.039
op-0.038
OO-0.035
 disease-0.035
age-0.035
Token big
Token ID1263
Feature activation+2.59e-02

Pos logit contributions
itten+0.007
 disease+0.006
 fault+0.006
OO+0.006
 cob+0.006
Neg logit contributions
 next-0.006
 Avery-0.006
 Claus-0.006
 Amber-0.006
maid-0.006
Token and
Token ID290
Feature activation-2.15e-02

Pos logit contributions
 muscles+0.039
 cob+0.038
itten+0.038
 fault+0.038
op+0.038
Neg logit contributions
Cl-0.033
 Avery-0.030
iza-0.029
Friend-0.029
 next-0.029
Token had
Token ID550
Feature activation+1.22e-02

Pos logit contributions
 next+0.028
 Avery+0.025
 timer+0.023
 Claus+0.023
 Amber+0.022
Neg logit contributions
itten-0.034
 unnecessary-0.032
OO-0.032
op-0.031
 disease-0.031
Token thick
Token ID6546
Feature activation+7.87e-02

Pos logit contributions
itten+0.016
 unnecessary+0.015
OO+0.015
 disease+0.015
 fault+0.015
Neg logit contributions
 next-0.020
 Avery-0.017
 timer-0.017
 Claus-0.017
oe-0.016
Token walls
Token ID7714
Feature activation+2.98e-02

Pos logit contributions
 unnecessary+0.129
itten+0.128
OO+0.126
op+0.122
 fault+0.122
Neg logit contributions
 next-0.106
 timer-0.085
 Avery-0.084
 smiling-0.081
 Anna-0.080
Token that
Token ID326
Feature activation+1.96e-02

Pos logit contributions
itten+0.051
op+0.050
age+0.049
 fault+0.048
ending+0.047
Neg logit contributions
 Avery-0.033
 next-0.031
 Amber-0.030
 Amen-0.030
 Maya-0.029
Token kept
Token ID4030
Feature activation-9.78e-03

Pos logit contributions
itten+0.029
 unnecessary+0.028
OO+0.028
op+0.028
 disease+0.027
Neg logit contributions
 next-0.028
 Avery-0.024
 timer-0.023
 Claus-0.022
 Anna-0.022
Token it
Token ID340
Feature activation+2.57e-02

Pos logit contributions
 next+0.013
 Avery+0.012
Anna+0.012
 timer+0.012
 waiter+0.011
Neg logit contributions
 unnecessary-0.014
itten-0.014
OO-0.013
 disease-0.013
 risks-0.013
Token warm
Token ID5814
Feature activation+2.77e-02

Pos logit contributions
itten+0.041
 unnecessary+0.040
 pierced+0.039
OO+0.039
 fault+0.039
Neg logit contributions
 Avery-0.030
 next-0.029
 Amber-0.027
 waiter-0.027
May-0.027
Token the
Token ID262
Feature activation+1.20e-02

Pos logit contributions
 next+0.038
 Avery+0.030
 Anna+0.029
 smiling+0.027
 Maya+0.026
Neg logit contributions
OO-0.068
itten-0.068
 unnecessary-0.065
op-0.064
ending-0.064
Token popcorn
Token ID38914
Feature activation+9.01e-03

Pos logit contributions
itten+0.018
 unnecessary+0.017
 disease+0.017
 fault+0.016
op+0.016
Neg logit contributions
 next-0.017
 Avery-0.016
 Claus-0.014
Santa-0.014
 Amber-0.014
Token.
Token ID13
Feature activation+4.29e-03

Pos logit contributions
itten+0.016
op+0.016
 fault+0.015
 disease+0.015
ops+0.015
Neg logit contributions
 next-0.009
 Avery-0.009
 Amen-0.009
Anna-0.008
May-0.008
Token They
Token ID1119
Feature activation+1.90e-02

Pos logit contributions
itten+0.006
OO+0.005
op+0.005
 fault+0.005
inter+0.005
Neg logit contributions
 next-0.007
 timer-0.006
 Claus-0.006
Santa-0.006
 sailor-0.006
Token spread
Token ID4104
Feature activation+8.04e-03

Pos logit contributions
itten+0.027
op+0.026
OO+0.026
 unnecessary+0.025
 disease+0.025
Neg logit contributions
 next-0.029
 Avery-0.025
 timer-0.024
 Claus-0.023
Santa-0.023
Token thick
Token ID6546
Feature activation+6.52e-02

Pos logit contributions
itten+0.013
 unnecessary+0.013
OO+0.013
op+0.013
 disease+0.013
Neg logit contributions
 next-0.010
 Avery-0.009
 timer-0.008
 Anna-0.008
 Claus-0.008
Token blankets
Token ID34794
Feature activation+9.54e-03

Pos logit contributions
itten+0.084
 unnecessary+0.081
OO+0.081
 disease+0.078
 fault+0.078
Neg logit contributions
 next-0.109
 Avery-0.094
 timer-0.091
 Anna-0.089
 Claus-0.089
Token on
Token ID319
Feature activation+4.31e-02

Pos logit contributions
itten+0.017
op+0.017
 fault+0.016
 pierced+0.016
OO+0.016
Neg logit contributions
 next-0.010
 Avery-0.009
 Amen-0.009
Anna-0.009
Santa-0.008
Token the
Token ID262
Feature activation+2.26e-02

Pos logit contributions
itten+0.069
op+0.066
OO+0.066
 unnecessary+0.066
 disease+0.066
Neg logit contributions
 next-0.055
 Claus-0.046
 Avery-0.046
 timer-0.044
 waiter-0.044
Token floor
Token ID4314
Feature activation+2.62e-02

Pos logit contributions
itten+0.028
 cob+0.027
 spike+0.026
 disease+0.025
op+0.025
Neg logit contributions
 Avery-0.034
 Amen-0.034
 next-0.033
 Amber-0.032
Friend-0.032
Token and
Token ID290
Feature activation+1.36e-02

Pos logit contributions
 fault+0.041
op+0.041
itten+0.040
 cob+0.039
OO+0.038
Neg logit contributions
 Amen-0.033
 Avery-0.029
 next-0.029
May-0.029
Santa-0.029
Token a
Token ID257
Feature activation-1.92e-03

Pos logit contributions
 next+0.068
 Claus+0.067
 timer+0.066
oe+0.065
 Avery+0.063
Neg logit contributions
 unnecessary-0.056
 cob-0.053
 disease-0.053
 risks-0.053
 fault-0.053
Token very
Token ID845
Feature activation-5.95e-03

Pos logit contributions
 next+0.003
 Avery+0.003
iza+0.003
 menus+0.003
 Claus+0.003
Neg logit contributions
 cob-0.002
 spike-0.002
 disease-0.002
itten-0.002
 tal-0.002
Token special
Token ID2041
Feature activation-3.44e-04

Pos logit contributions
 Claus+0.011
 next+0.011
 Avery+0.011
 sailors+0.010
 Amen+0.010
Neg logit contributions
 disease-0.005
 tum-0.005
 unnecessary-0.005
 cob-0.005
 spike-0.005
Token gift
Token ID6979
Feature activation-1.18e-02

Pos logit contributions
 next+0.001
 Avery+0.001
 timer+0.001
 Claus+0.001
Santa+0.001
Neg logit contributions
itten-0.000
 disease-0.000
 tal-0.000
 cob-0.000
 fault-0.000
Token.
Token ID13
Feature activation-4.10e-03

Pos logit contributions
 next+0.014
 Avery+0.014
 Amen+0.014
 timer+0.014
Cl+0.013
Neg logit contributions
itten-0.018
age-0.017
 fault-0.017
op-0.016
 cob-0.016
Token Let
Token ID3914
Feature activation+2.00e-02

Pos logit contributions
 next+0.008
 timer+0.007
 Claus+0.007
maid+0.007
guard+0.007
Neg logit contributions
 Everything-0.004
itten-0.004
op-0.004
 Ow-0.004
inter-0.004
Token's
Token ID338
Feature activation+4.79e-02

Pos logit contributions
itten+0.029
 unnecessary+0.028
OO+0.028
 disease+0.027
op+0.027
Neg logit contributions
 next-0.030
 Avery-0.026
 timer-0.024
 Anna-0.024
 Claus-0.024
Token keep
Token ID1394
Feature activation-1.08e-02

Pos logit contributions
itten+0.055
op+0.052
age+0.052
 spike+0.051
OO+0.051
Neg logit contributions
 next-0.080
 timer-0.072
 Avery-0.071
Santa-0.071
Cl-0.069
Token it
Token ID340
Feature activation+2.95e-02

Pos logit contributions
 timer+0.022
Anna+0.022
 next+0.022
ider+0.022
 waiter+0.021
Neg logit contributions
 unnecessary-0.008
 tal-0.007
 risks-0.006
 spike-0.006
 disease-0.006
Token safe
Token ID3338
Feature activation+1.75e-02

Pos logit contributions
itten+0.036
age+0.035
 footing+0.034
 tal+0.033
 unnecessary+0.033
Neg logit contributions
 Avery-0.043
 waiter-0.043
 timer-0.043
Say-0.043
 Amen-0.042
Token.
Token ID13
Feature activation+1.94e-03

Pos logit contributions
 fault+0.025
age+0.022
 risks+0.022
 footing+0.022
OO+0.022
Neg logit contributions
 timer-0.023
Say-0.022
 Avery-0.022
Friend-0.022
 waiter-0.021
Token asked
Token ID1965
Feature activation+6.97e-03

Pos logit contributions
 next+0.011
Santa+0.010
 timer+0.010
 sailors+0.010
 Avery+0.010
Neg logit contributions
itten-0.012
op-0.011
OO-0.011
 cob-0.011
 pierced-0.011
Token,
Token ID11
Feature activation+1.81e-02

Pos logit contributions
itten+0.009
OO+0.009
 unnecessary+0.009
 disease+0.009
op+0.009
Neg logit contributions
 next-0.010
 timer-0.009
 Claus-0.009
 Avery-0.009
Santa-0.009
Token "
Token ID366
Feature activation+1.69e-02

Pos logit contributions
op+0.014
itten+0.014
OO+0.013
 spike+0.011
 oh+0.011
Neg logit contributions
 menus-0.034
iza-0.032
 timer-0.032
 invites-0.032
guard-0.032
TokenMom
Token ID29252
Feature activation+2.14e-02

Pos logit contributions
itten+0.023
op+0.021
ending+0.021
OO+0.020
 spike+0.020
Neg logit contributions
 thanks-0.022
 Claus-0.022
 next-0.021
oe-0.021
 Avery-0.021
Tokenmy
Token ID1820
Feature activation+5.40e-03

Pos logit contributions
itten+0.028
OO+0.027
op+0.027
 fault+0.026
're+0.025
Neg logit contributions
 next-0.032
Cl-0.031
Santa-0.031
 timer-0.030
 Avery-0.029
Token,
Token ID11
Feature activation+1.64e-02

Pos logit contributions
op+0.006
itten+0.006
OO+0.006
 fault+0.006
 disease+0.005
Neg logit contributions
 next-0.009
 timer-0.009
Santa-0.009
Cl-0.009
Say-0.009
Token why
Token ID1521
Feature activation+5.75e-03

Pos logit contributions
itten+0.022
OO+0.022
op+0.021
 unnecessary+0.021
 disease+0.021
Neg logit contributions
 next-0.025
 timer-0.022
 Avery-0.021
oe-0.021
 Claus-0.021
Token aren
Token ID3588
Feature activation-1.42e-02

Pos logit contributions
itten+0.008
 unnecessary+0.008
OO+0.007
 disease+0.007
op+0.007
Neg logit contributions
 next-0.009
 Avery-0.008
 timer-0.008
 Anna-0.008
 Claus-0.008
Token't
Token ID470
Feature activation+2.69e-03

Pos logit contributions
 next+0.024
 Avery+0.021
 timer+0.020
 Claus+0.020
Santa+0.020
Neg logit contributions
itten-0.018
 unnecessary-0.017
OO-0.017
op-0.016
 disease-0.016
Token the
Token ID262
Feature activation-1.66e-02

Pos logit contributions
itten+0.005
 unnecessary+0.005
OO+0.005
 disease+0.005
 fault+0.005
Neg logit contributions
 next-0.003
 Avery-0.002
 timer-0.002
 Claus-0.002
Santa-0.002
Token stars
Token ID5788
Feature activation-3.88e-02

Pos logit contributions
 next+0.026
 Avery+0.025
oe+0.024
 Claus+0.024
Cl+0.024
Neg logit contributions
 fault-0.021
 disease-0.021
itten-0.021
 cob-0.020
 unnecessary-0.020
Token Mom
Token ID11254
Feature activation+8.95e-03

Pos logit contributions
 next+0.000
 timer+0.000
 Claus+0.000
 Avery+0.000
Santa+0.000
Neg logit contributions
itten-0.000
OO-0.000
op-0.000
 fault-0.000
 unnecessary-0.000
Token said
Token ID531
Feature activation+3.93e-03

Pos logit contributions
itten+0.013
op+0.013
ilst+0.012
ending+0.012
OO+0.011
Neg logit contributions
 next-0.012
 Amen-0.011
 Avery-0.011
Cl-0.011
Santa-0.011
Token it
Token ID340
Feature activation+2.35e-02

Pos logit contributions
itten+0.006
op+0.005
OO+0.005
 unnecessary+0.005
 disease+0.005
Neg logit contributions
 next-0.006
 Claus-0.005
 timer-0.005
 Avery-0.005
Santa-0.005
Token was
Token ID373
Feature activation+1.08e-02

Pos logit contributions
itten+0.038
 unnecessary+0.037
OO+0.036
 disease+0.035
 fault+0.035
Neg logit contributions
 next-0.033
 Avery-0.027
 Anna-0.026
 timer-0.026
 Claus-0.026
Token time
Token ID640
Feature activation-3.76e-03

Pos logit contributions
 unnecessary+0.011
itten+0.011
 fault+0.011
 disease+0.011
OO+0.011
Neg logit contributions
 next-0.018
 Claus-0.018
 timer-0.018
 Avery-0.018
 lad-0.017
Token to
Token ID284
Feature activation+2.72e-02

Pos logit contributions
 next+0.006
 timer+0.005
 Avery+0.005
Anna+0.005
 Claus+0.005
Neg logit contributions
itten-0.005
 fault-0.005
OO-0.005
op-0.005
 unnecessary-0.005
Token have
Token ID423
Feature activation+1.03e-02

Pos logit contributions
itten+0.022
 risks+0.020
 itch+0.020
 footing+0.020
 disease+0.020
Neg logit contributions
 next-0.050
 Claus-0.050
ider-0.050
keeper-0.049
 timer-0.048
Token a
Token ID257
Feature activation+4.38e-03

Pos logit contributions
itten+0.015
OO+0.015
 unnecessary+0.014
op+0.014
 disease+0.014
Neg logit contributions
 next-0.016
 Avery-0.013
 timer-0.013
 Anna-0.013
 Claus-0.012
Token snack
Token ID26906
Feature activation-4.15e-03

Pos logit contributions
itten+0.005
 unnecessary+0.005
 disease+0.005
OO+0.005
 fault+0.005
Neg logit contributions
 next-0.008
 Avery-0.007
 Claus-0.006
Santa-0.006
 timer-0.006
Token.
Token ID13
Feature activation-3.43e-03

Pos logit contributions
 next+0.004
 Avery+0.004
 Amen+0.004
 Claus+0.004
Santa+0.004
Neg logit contributions
itten-0.007
op-0.007
 fault-0.007
OO-0.007
 disease-0.007
Token She
Token ID1375
Feature activation-3.82e-03

Pos logit contributions
 next+0.005
 Claus+0.005
 timer+0.004
 sailors+0.004
 sailor+0.004
Neg logit contributions
op-0.005
itten-0.005
inter-0.005
OO-0.005
 Ow-0.005
Token between
Token ID1022
Feature activation+1.28e-02

Pos logit contributions
op+0.094
itten+0.089
 cob+0.088
age+0.087
 risks+0.087
Neg logit contributions
 Amen-0.056
 Avery-0.054
 Claus-0.054
 next-0.053
Santa-0.052
Token them
Token ID606
Feature activation+2.03e-02

Pos logit contributions
itten+0.017
 spike+0.017
 fault+0.017
 tal+0.017
 unnecessary+0.017
Neg logit contributions
 Claus-0.018
 next-0.016
ider-0.015
 Avery-0.015
Santa-0.015
Token.
Token ID13
Feature activation+1.38e-02

Pos logit contributions
op+0.033
ops+0.031
 muscles+0.031
 fault+0.031
itten+0.031
Neg logit contributions
 Amen-0.023
 Avery-0.023
Anna-0.023
iza-0.022
Santa-0.022
Token Jane
Token ID12091
Feature activation+1.05e-02

Pos logit contributions
OO+0.017
itten+0.017
 fault+0.017
op+0.017
 Everything+0.017
Neg logit contributions
 next-0.022
 Claus-0.020
 timer-0.019
 sailor-0.019
Santa-0.018
Token smiled
Token ID13541
Feature activation-1.94e-02

Pos logit contributions
op+0.016
itten+0.016
OO+0.014
 pierced+0.014
 tal+0.014
Neg logit contributions
 next-0.014
 Avery-0.013
 timer-0.013
Santa-0.013
 Claus-0.012
Token,
Token ID11
Feature activation+1.54e-02

Pos logit contributions
 timer+0.027
Anna+0.025
iza+0.025
 menus+0.024
Say+0.023
Neg logit contributions
 muscles-0.024
OO-0.024
 tal-0.024
 bal-0.024
 fault-0.024
Token as
Token ID355
Feature activation-8.64e-03

Pos logit contributions
itten+0.023
 tal+0.021
apy+0.021
 cob+0.020
 disease+0.020
Neg logit contributions
 next-0.018
 Claus-0.018
oe-0.018
 menus-0.018
 Avery-0.018
Token it
Token ID340
Feature activation+2.38e-02

Pos logit contributions
 timer+0.014
ider+0.014
 Claus+0.013
 next+0.013
 Amen+0.012
Neg logit contributions
 happened-0.010
apy-0.010
 cob-0.009
 mattered-0.009
 weren-0.009
Token was
Token ID373
Feature activation+1.82e-02

Pos logit contributions
itten+0.031
op+0.030
 pierced+0.029
OO+0.029
 cob+0.028
Neg logit contributions
 next-0.037
 Avery-0.034
Santa-0.032
 Claus-0.032
 Maya-0.032
Token the
Token ID262
Feature activation-1.11e-02

Pos logit contributions
itten+0.023
 fault+0.023
 pierced+0.022
OO+0.022
 tal+0.022
Neg logit contributions
 next-0.027
 Claus-0.025
Anna-0.025
Santa-0.025
 Avery-0.025
Token y
Token ID331
Feature activation+2.99e-02

Pos logit contributions
 Claus+0.019
 Avery+0.019
 next+0.018
Santa+0.018
 Maya+0.018
Neg logit contributions
 fault-0.011
itten-0.011
 cob-0.010
 disease-0.010
 unnecessary-0.010
Token will
Token ID481
Feature activation-1.33e-02

Pos logit contributions
itten+0.000
 unnecessary+0.000
OO+0.000
op+0.000
 fault+0.000
Neg logit contributions
 next-0.001
 Avery-0.000
 timer-0.000
Santa-0.000
 Anna-0.000
Token score
Token ID4776
Feature activation-6.73e-03

Pos logit contributions
 next+0.021
 timer+0.019
 Avery+0.019
 waiter+0.019
 Claus+0.018
Neg logit contributions
 unnecessary-0.016
 fault-0.016
itten-0.016
OO-0.016
 disease-0.016
Token."
Token ID526
Feature activation-4.02e-03

Pos logit contributions
 timer+0.009
 next+0.009
 Claus+0.009
 Avery+0.009
 Amen+0.009
Neg logit contributions
 unnecessary-0.010
 risks-0.009
 fault-0.009
OO-0.009
 tal-0.009
Token The
Token ID383
Feature activation-8.27e-03

Pos logit contributions
 next+0.006
 timer+0.005
 bride+0.005
oe+0.005
 Claus+0.005
Neg logit contributions
op-0.006
itten-0.006
 tal-0.005
 fault-0.005
 Ow-0.005
Token little
Token ID1310
Feature activation+1.71e-03

Pos logit contributions
 Avery+0.011
 next+0.010
 Maya+0.010
 Claus+0.010
 Amber+0.010
Neg logit contributions
itten-0.012
 spike-0.012
 disease-0.012
 cob-0.012
OO-0.012
Token bird
Token ID6512
Feature activation+2.09e-02

Pos logit contributions
itten+0.002
 spike+0.002
 unnecessary+0.002
 disease+0.002
OO+0.002
Neg logit contributions
 next-0.003
 Avery-0.003
 Claus-0.002
Santa-0.002
 timer-0.002
Token practiced
Token ID19893
Feature activation-2.21e-02

Pos logit contributions
itten+0.029
 tal+0.027
age+0.027
 pierced+0.027
op+0.027
Neg logit contributions
 next-0.030
 Avery-0.029
Santa-0.027
 timer-0.027
 Claus-0.027
Token every
Token ID790
Feature activation-1.34e-02

Pos logit contributions
 next+0.030
 timer+0.027
 Avery+0.026
 Claus+0.026
 waiter+0.026
Neg logit contributions
 unnecessary-0.032
itten-0.032
OO-0.031
 fault-0.031
 disease-0.031
Token day
Token ID1110
Feature activation-2.22e-02

Pos logit contributions
 next+0.023
 Avery+0.021
 Claus+0.021
 timer+0.020
Santa+0.020
Neg logit contributions
itten-0.016
OO-0.015
 unnecessary-0.015
 disease-0.015
op-0.014
Token,
Token ID11
Feature activation+2.54e-02

Pos logit contributions
 next+0.023
 timer+0.020
 Avery+0.019
 Claus+0.019
Santa+0.018
Neg logit contributions
itten-0.040
op-0.040
 unnecessary-0.040
OO-0.039
 risks-0.038
Token and
Token ID290
Feature activation+6.89e-03

Pos logit contributions
itten+0.041
 fault+0.040
 unnecessary+0.040
 muscles+0.040
op+0.040
Neg logit contributions
 next-0.028
 timer-0.026
 Claus-0.026
oe-0.026
Cl-0.026
Token.
Token ID13
Feature activation+1.59e-03

Pos logit contributions
 unnecessary+0.017
itten+0.017
 fault+0.017
op+0.017
OO+0.017
Neg logit contributions
 next-0.011
 Avery-0.010
 timer-0.010
 Amen-0.009
Santa-0.009
Token
Token ID220
Feature activation+4.24e-02

Pos logit contributions
itten+0.002
OO+0.002
 fault+0.002
 unnecessary+0.002
 tal+0.002
Neg logit contributions
 next-0.002
 Avery-0.002
 Claus-0.002
 sailor-0.002
 timer-0.002
Token\n
Token ID198
Feature activation+6.11e-03

Pos logit contributions
itten+0.080
OO+0.078
op+0.077
 unnecessary+0.077
 fault+0.077
Neg logit contributions
 next-0.045
 Avery-0.036
 timer-0.036
 Claus-0.033
Santa-0.032
Token\n
Token ID198
Feature activation-2.97e-03

Pos logit contributions
itten+0.010
op+0.010
OO+0.010
 unnecessary+0.010
age+0.010
Neg logit contributions
 next-0.007
 Avery-0.006
 Claus-0.006
 timer-0.006
 sailor-0.006
TokenL
Token ID43
Feature activation-3.90e-02

Pos logit contributions
 next+0.004
 Avery+0.004
 Claus+0.004
 Maya+0.003
 Amber+0.003
Neg logit contributions
itten-0.004
OO-0.004
 unnecessary-0.004
op-0.004
 disease-0.004
Tokenily
Token ID813
Feature activation+5.01e-03

Pos logit contributions
 next+0.049
 Avery+0.043
 Claus+0.043
 timer+0.041
Santa+0.040
Neg logit contributions
itten-0.065
OO-0.065
 unnecessary-0.062
 fault-0.062
op-0.061
Token and
Token ID290
Feature activation+2.81e-02

Pos logit contributions
itten+0.008
op+0.008
age+0.008
 tal+0.008
OO+0.007
Neg logit contributions
 next-0.006
 Avery-0.006
 timer-0.006
 Amen-0.005
 menus-0.005
Token Tim
Token ID5045
Feature activation+4.12e-02

Pos logit contributions
itten+0.037
 unnecessary+0.036
OO+0.035
op+0.034
 disease+0.034
Neg logit contributions
 next-0.047
 Avery-0.041
 timer-0.039
 Anna-0.039
 Claus-0.039
Tokenmy
Token ID1820
Feature activation+3.41e-02

Pos logit contributions
itten+0.075
 unnecessary+0.073
OO+0.073
op+0.071
 disease+0.071
Neg logit contributions
 next-0.047
 Avery-0.038
 timer-0.036
 Claus-0.035
 Anna-0.034
Token were
Token ID547
Feature activation+3.40e-02

Pos logit contributions
itten+0.054
 unnecessary+0.053
 fault+0.052
OO+0.052
 disease+0.051
Neg logit contributions
 next-0.049
 Avery-0.039
 Claus-0.038
 Anna-0.038
 kindly-0.037
Token happy
Token ID3772
Feature activation+8.68e-04

Pos logit contributions
itten+0.064
OO+0.062
 unnecessary+0.062
op+0.061
 disease+0.060
Neg logit contributions
 next-0.038
 Avery-0.029
 Anna-0.028
 timer-0.027
 Claus-0.027
Token a
Token ID257
Feature activation+6.49e-03

Pos logit contributions
itten+0.007
 unnecessary+0.007
OO+0.007
 disease+0.007
op+0.007
Neg logit contributions
 next-0.010
 Avery-0.008
 timer-0.008
 Claus-0.008
 Anna-0.008
Token whistle
Token ID16121
Feature activation+2.21e-03

Pos logit contributions
 spike+0.008
itten+0.007
 cob+0.007
 disease+0.007
OO+0.007
Neg logit contributions
 next-0.011
 Avery-0.010
iza-0.010
 Claire-0.010
Cl-0.010
Token.
Token ID13
Feature activation+1.20e-02

Pos logit contributions
op+0.004
age+0.004
OO+0.004
itten+0.004
eware+0.004
Neg logit contributions
 Avery-0.002
 next-0.002
iza-0.002
 Claire-0.002
 Amber-0.002
Token It
Token ID632
Feature activation+2.16e-02

Pos logit contributions
itten+0.012
op+0.012
OO+0.012
eware+0.012
 Ow+0.012
Neg logit contributions
 next-0.021
 timer-0.020
oe-0.020
 Avery-0.019
 Claus-0.019
Token can
Token ID460
Feature activation-2.23e-02

Pos logit contributions
itten+0.032
op+0.031
 unnecessary+0.031
 fault+0.031
OO+0.031
Neg logit contributions
 next-0.029
 Avery-0.027
 Claire-0.025
 timer-0.025
Cl-0.025
Token go
Token ID467
Feature activation-1.99e-02

Pos logit contributions
 next+0.035
 Avery+0.031
 timer+0.029
 Claus+0.029
 Anna+0.028
Neg logit contributions
itten-0.030
 unnecessary-0.029
OO-0.029
op-0.028
 disease-0.028
Token fast
Token ID3049
Feature activation+3.08e-03

Pos logit contributions
 Avery+0.029
 Amber+0.028
 Claire+0.027
Marie+0.027
 waiter+0.026
Neg logit contributions
 unnecessary-0.029
 risks-0.027
ch-0.027
 spike-0.027
eware-0.026
Token and
Token ID290
Feature activation-1.53e-02

Pos logit contributions
 risks+0.005
 unnecessary+0.005
 fault+0.005
 muscles+0.005
OO+0.005
Neg logit contributions
 Avery-0.003
 Amber-0.003
 timer-0.003
iza-0.003
 Claire-0.003
Token slow
Token ID3105
Feature activation+2.22e-02

Pos logit contributions
 next+0.021
 Avery+0.020
Cl+0.019
iza+0.019
 Claire+0.019
Neg logit contributions
 unnecessary-0.023
 risks-0.021
 disease-0.020
 muscles-0.020
itten-0.020
Token.
Token ID13
Feature activation+8.67e-03

Pos logit contributions
 unnecessary+0.041
 risks+0.041
 muscles+0.040
 fault+0.040
OO+0.040
Neg logit contributions
 Avery-0.021
 timer-0.020
 next-0.020
iza-0.019
 Amber-0.019
Token Do
Token ID2141
Feature activation-4.21e-03

Pos logit contributions
itten+0.012
OO+0.012
 unnecessary+0.011
op+0.011
eware+0.011
Neg logit contributions
 next-0.013
 timer-0.012
oe-0.011
 Avery-0.011
 Claus-0.011
Token.
Token ID13
Feature activation-5.10e-03

Pos logit contributions
 next+0.022
 timer+0.022
 Avery+0.021
 Amen+0.021
Santa+0.021
Neg logit contributions
OO-0.034
itten-0.033
 unnecessary-0.032
 fault-0.032
 pierced-0.032
Token Then
Token ID3244
Feature activation-2.47e-02

Pos logit contributions
 next+0.007
 timer+0.007
 bride+0.007
maid+0.007
iza+0.006
Neg logit contributions
 Everything-0.007
OO-0.007
itten-0.006
 tal-0.006
op-0.006
Token John
Token ID1757
Feature activation-1.40e-02

Pos logit contributions
 next+0.035
 timer+0.034
Cl+0.033
 Avery+0.033
Anna+0.032
Neg logit contributions
 muscles-0.029
 pierced-0.029
apy-0.029
 happened-0.029
 risks-0.029
Token never
Token ID1239
Feature activation-2.94e-03

Pos logit contributions
 Amen+0.023
 next+0.022
 timer+0.022
 Avery+0.021
 menus+0.021
Neg logit contributions
itten-0.016
 pierced-0.016
op-0.016
ched-0.015
OO-0.015
Token woke
Token ID19092
Feature activation+1.50e-03

Pos logit contributions
 next+0.004
 smiling+0.003
 Anna+0.003
 Avery+0.003
 smiled+0.003
Neg logit contributions
itten-0.005
op-0.005
OO-0.005
 unnecessary-0.005
 fault-0.005
Token up
Token ID510
Feature activation-2.55e-02

Pos logit contributions
 unnecessary+0.002
 fault+0.002
 spike+0.002
 risks+0.002
 disease+0.002
Neg logit contributions
 timer-0.002
 next-0.002
 waiter-0.002
Friend-0.002
Santa-0.002
Token again
Token ID757
Feature activation-2.79e-02

Pos logit contributions
 timer+0.043
Anna+0.040
 Amen+0.040
 Claus+0.039
Santa+0.038
Neg logit contributions
 fault-0.031
 injury-0.030
 badly-0.029
 disease-0.029
 risks-0.028
Token.
Token ID13
Feature activation-1.70e-03

Pos logit contributions
 timer+0.034
 Amen+0.032
 Claus+0.032
Santa+0.032
 Avery+0.032
Neg logit contributions
 risks-0.042
 tal-0.042
 fault-0.042
 unnecessary-0.042
 muscles-0.041
Token<|endoftext|>
Token ID50256
Feature activation-1.62e-04

Pos logit contributions
 next+0.002
 timer+0.002
 Avery+0.002
 bride+0.002
iza+0.002
Neg logit contributions
OO-0.003
 fault-0.003
itten-0.003
 tal-0.003
ilst-0.002
TokenOnce
Token ID7454
Feature activation-6.87e-03

Pos logit contributions
 timer+0.000
 next+0.000
 bride+0.000
 Claus+0.000
 Avery+0.000
Neg logit contributions
op-0.000
itten-0.000
 Ow-0.000
 happened-0.000
OO-0.000
Token upon
Token ID2402
Feature activation+3.89e-03

Pos logit contributions
 next+0.007
 Avery+0.005
 timer+0.005
 Anna+0.005
 Claus+0.005
Neg logit contributions
itten-0.013
 unnecessary-0.013
OO-0.013
 disease-0.013
op-0.013
Token be
Token ID307
Feature activation+4.79e-04

Pos logit contributions
itten+0.013
 disease+0.013
 unnecessary+0.012
 risks+0.012
op+0.012
Neg logit contributions
 next-0.014
 Avery-0.012
 Claus-0.012
 timer-0.012
Santa-0.012
Token the
Token ID262
Feature activation-1.62e-02

Pos logit contributions
 unnecessary+0.001
 risks+0.001
 muscles+0.001
 disease+0.001
 fault+0.001
Neg logit contributions
 timer-0.001
 next-0.001
 Avery-0.001
 Claus-0.001
oe-0.001
Token best
Token ID1266
Feature activation-7.52e-03

Pos logit contributions
 Avery+0.023
 next+0.023
oe+0.021
iza+0.021
Santa+0.021
Neg logit contributions
 unnecessary-0.021
 fault-0.020
 disease-0.020
 risks-0.020
itten-0.019
Token,
Token ID11
Feature activation-1.40e-02

Pos logit contributions
 next+0.010
 Avery+0.009
Santa+0.009
 timer+0.008
 Claus+0.008
Neg logit contributions
op-0.012
itten-0.012
 fault-0.011
 disease-0.011
OO-0.011
Token so
Token ID523
Feature activation+1.33e-03

Pos logit contributions
 next+0.021
 Avery+0.020
 Claus+0.019
oe+0.019
 timer+0.019
Neg logit contributions
 unnecessary-0.018
itten-0.018
 risks-0.018
 fault-0.018
 disease-0.018
Token he
Token ID339
Feature activation-4.21e-02

Pos logit contributions
itten+0.002
 unnecessary+0.002
op+0.002
OO+0.002
 disease+0.002
Neg logit contributions
 next-0.002
 Avery-0.002
 Claus-0.002
 timer-0.002
 Maya-0.002
Token went
Token ID1816
Feature activation-2.62e-02

Pos logit contributions
 next+0.062
 Avery+0.058
 Amen+0.057
 smiling+0.055
 timer+0.054
Neg logit contributions
 pierced-0.054
ched-0.050
eware-0.050
op-0.050
OO-0.050
Token on
Token ID319
Feature activation+2.47e-02

Pos logit contributions
 Amen+0.045
 Avery+0.045
iza+0.044
 We+0.042
maid+0.042
Neg logit contributions
 exercise-0.023
 risks-0.023
 unnecessary-0.022
 disease-0.022
 hitting-0.021
Token stage
Token ID3800
Feature activation-3.80e-02

Pos logit contributions
 unnecessary+0.034
 fault+0.033
OO+0.033
 disease+0.033
 risks+0.033
Neg logit contributions
 next-0.035
 Claus-0.032
 timer-0.031
 Avery-0.031
iza-0.030
Token.
Token ID13
Feature activation-7.31e-03

Pos logit contributions
 Amen+0.044
iza+0.042
Santa+0.042
 timer+0.041
 Avery+0.041
Neg logit contributions
op-0.056
 fault-0.056
 bars-0.054
eware-0.053
OO-0.053
Token He
Token ID679
Feature activation-7.84e-03

Pos logit contributions
 next+0.009
 Avery+0.008
 timer+0.007
 Claus+0.007
Santa+0.007
Neg logit contributions
itten-0.012
OO-0.012
 unnecessary-0.012
op-0.012
 fault-0.011
Token\n
Token ID198
Feature activation+2.23e-02

Pos logit contributions
op+0.020
 Everything+0.019
itten+0.019
OO+0.019
 fault+0.019
Neg logit contributions
 next-0.022
 timer-0.021
 Claus-0.019
 keeper-0.019
guard-0.018
Token\n
Token ID198
Feature activation+1.37e-02

Pos logit contributions
itten+0.037
op+0.036
OO+0.035
 tal+0.034
 unnecessary+0.034
Neg logit contributions
 next-0.028
 timer-0.024
 Claus-0.024
 Avery-0.023
 lad-0.021
TokenAfter
Token ID3260
Feature activation+1.28e-02

Pos logit contributions
op+0.020
itten+0.020
OO+0.018
ending+0.018
apy+0.018
Neg logit contributions
 next-0.019
 Claus-0.019
 Avery-0.017
 timer-0.017
 Anna-0.016
Tokenwards
Token ID2017
Feature activation+4.64e-03

Pos logit contributions
itten+0.016
op+0.016
apy+0.016
 unnecessary+0.016
 disease+0.016
Neg logit contributions
 next-0.019
 Claus-0.017
 timer-0.017
 Avery-0.017
oe-0.016
Token,
Token ID11
Feature activation+3.61e-02

Pos logit contributions
op+0.005
itten+0.005
apy+0.005
ops+0.005
 tum+0.004
Neg logit contributions
 next-0.007
 Claus-0.007
 timer-0.007
 Amen-0.007
Santa-0.007
Token Daddy
Token ID26926
Feature activation-6.63e-03

Pos logit contributions
apy+0.051
op+0.048
 tum+0.047
itten+0.046
 muscles+0.045
Neg logit contributions
 Amen-0.046
 next-0.046
 Claus-0.045
 timer-0.045
ider-0.045
Token hugged
Token ID41130
Feature activation-2.70e-02

Pos logit contributions
ider+0.009
Cl+0.009
 Amen+0.009
 next+0.009
 Avery+0.008
Neg logit contributions
op-0.009
itten-0.009
ops-0.008
 happened-0.008
 pierced-0.008
Token his
Token ID465
Feature activation+1.30e-02

Pos logit contributions
ider+0.046
 Claus+0.041
 timer+0.041
 next+0.037
anut+0.037
Neg logit contributions
 tum-0.029
 Everything-0.026
 disease-0.026
 Ow-0.026
op-0.026
Token little
Token ID1310
Feature activation-1.01e-02

Pos logit contributions
 muscles+0.014
 tum+0.013
 habit+0.013
 cob+0.013
 worries+0.012
Neg logit contributions
Santa-0.022
amer-0.022
Say-0.021
Anna-0.021
Cl-0.021
Token girl
Token ID2576
Feature activation-3.20e-03

Pos logit contributions
Santa+0.014
 Claus+0.013
Say+0.013
 Amen+0.013
Mr+0.012
Neg logit contributions
 tum-0.014
 muscles-0.014
 fault-0.014
inter-0.014
itten-0.014
Token and
Token ID290
Feature activation+2.82e-03

Pos logit contributions
 timer+0.004
Santa+0.004
ider+0.004
Anna+0.004
iza+0.004
Neg logit contributions
itten-0.005
op-0.005
 tal-0.004
 muscles-0.004
age-0.004
Token.
Token ID13
Feature activation-1.87e-02

Pos logit contributions
itten+0.012
 unnecessary+0.012
op+0.012
 disease+0.012
 risks+0.012
Neg logit contributions
 next-0.012
 timer-0.010
 Claus-0.010
 Avery-0.009
Santa-0.009
Token We
Token ID775
Feature activation+4.90e-03

Pos logit contributions
 timer+0.031
 next+0.030
 Claus+0.028
maid+0.027
 bride+0.027
Neg logit contributions
op-0.022
 Everything-0.022
itten-0.021
apy-0.020
 muscles-0.020
Token can
Token ID460
Feature activation-1.70e-02

Pos logit contributions
itten+0.006
 unnecessary+0.005
op+0.005
 fault+0.005
OO+0.005
Neg logit contributions
 next-0.009
 Avery-0.008
 timer-0.008
 Claus-0.007
Santa-0.007
Token come
Token ID1282
Feature activation-2.27e-02

Pos logit contributions
 next+0.027
 Avery+0.026
 Amber+0.025
 waiter+0.025
Cl+0.025
Neg logit contributions
op-0.020
itten-0.019
 spike-0.018
 fault-0.018
age-0.018
Token and
Token ID290
Feature activation-1.78e-02

Pos logit contributions
iza+0.039
 timer+0.038
 Avery+0.037
 waiter+0.037
 lad+0.036
Neg logit contributions
 unnecessary-0.023
age-0.021
're-0.021
 shelter-0.021
eware-0.021
Token visit
Token ID3187
Feature activation-4.72e-02

Pos logit contributions
 next+0.027
 Avery+0.026
iza+0.025
 timer+0.025
Cl+0.025
Neg logit contributions
 unnecessary-0.020
 risks-0.019
 disease-0.019
 spike-0.019
 muscles-0.019
Token it
Token ID340
Feature activation+8.50e-03

Pos logit contributions
ider+0.066
 timer+0.065
 Claus+0.059
 next+0.059
 lad+0.058
Neg logit contributions
 unnecessary-0.065
 risks-0.061
 tal-0.059
OO-0.056
 disease-0.055
Token every
Token ID790
Feature activation-1.91e-02

Pos logit contributions
OO+0.012
 fault+0.012
age+0.012
 unnecessary+0.011
 bars+0.011
Neg logit contributions
iza-0.011
 Avery-0.011
 lad-0.010
 waiter-0.009
 timer-0.009
Token day
Token ID1110
Feature activation-2.90e-02

Pos logit contributions
 Avery+0.035
 next+0.034
 Claus+0.033
 Ada+0.032
iza+0.032
Neg logit contributions
 disease-0.018
itten-0.018
 unnecessary-0.016
OO-0.016
 cob-0.016
Token."
Token ID526
Feature activation-1.23e-02

Pos logit contributions
 timer+0.036
 waiter+0.035
Friend+0.034
 seller+0.033
 keeper+0.033
Neg logit contributions
op-0.040
 risks-0.038
 unnecessary-0.038
're-0.038
OO-0.038
Token\n
Token ID198
Feature activation-8.21e-04

Pos logit contributions
 timer+0.023
ider+0.022
 next+0.022
 Claus+0.021
rant+0.021
Neg logit contributions
op-0.011
 Nut-0.010
apy-0.010
 tal-0.010
 Ow-0.010
Token and
Token ID290
Feature activation+4.63e-04

Pos logit contributions
itten+0.010
op+0.009
OO+0.009
ilst+0.009
 cob+0.009
Neg logit contributions
 next-0.005
 Avery-0.004
 Amen-0.004
 timer-0.004
Santa-0.004
Token Max
Token ID5436
Feature activation-1.36e-02

Pos logit contributions
itten+0.001
 unnecessary+0.001
OO+0.001
op+0.001
 disease+0.001
Neg logit contributions
 next-0.001
 timer-0.001
 Avery-0.000
 Claus-0.000
Santa-0.000
Token was
Token ID373
Feature activation+1.55e-02

Pos logit contributions
 next+0.020
 Avery+0.018
 timer+0.017
Santa+0.016
 Claus+0.016
Neg logit contributions
itten-0.019
OO-0.019
op-0.018
 unnecessary-0.018
 disease-0.018
Token the
Token ID262
Feature activation-1.35e-02

Pos logit contributions
 unnecessary+0.025
itten+0.024
op+0.024
OO+0.024
 cob+0.024
Neg logit contributions
 next-0.018
 timer-0.017
 Avery-0.016
 Claus-0.016
Santa-0.016
Token h
Token ID289
Feature activation-1.61e-02

Pos logit contributions
 next+0.020
 Avery+0.017
 timer+0.016
 Claus+0.016
 Anna+0.016
Neg logit contributions
itten-0.020
 unnecessary-0.020
OO-0.019
 disease-0.019
 fault-0.019
Tokenider
Token ID1304
Feature activation-5.89e-02

Pos logit contributions
 next+0.010
 Avery+0.006
 timer+0.005
 Claus+0.005
 Anna+0.005
Neg logit contributions
itten-0.038
OO-0.037
 unnecessary-0.037
op-0.037
 disease-0.036
Token.
Token ID13
Feature activation-3.82e-03

Pos logit contributions
 next+0.048
 Avery+0.041
 timer+0.039
 Claus+0.039
Santa+0.037
Neg logit contributions
itten-0.121
OO-0.120
 fault-0.118
op-0.117
 unnecessary-0.117
Token Lily
Token ID20037
Feature activation-1.96e-02

Pos logit contributions
 next+0.005
 Avery+0.004
 Anna+0.004
 timer+0.003
 Claus+0.003
Neg logit contributions
itten-0.007
 unnecessary-0.007
OO-0.007
 disease-0.006
 fault-0.006
Token counted
Token ID14789
Feature activation-6.70e-03

Pos logit contributions
 next+0.028
 Avery+0.024
 timer+0.024
Santa+0.023
 Claus+0.023
Neg logit contributions
itten-0.029
OO-0.028
op-0.027
 unnecessary-0.027
 pierced-0.026
Token to
Token ID284
Feature activation+1.33e-02

Pos logit contributions
 next+0.009
 Avery+0.007
 timer+0.007
 Claus+0.007
 Anna+0.006
Neg logit contributions
itten-0.011
 unnecessary-0.011
OO-0.011
op-0.011
 disease-0.011
Token ten
Token ID3478
Feature activation-2.79e-02

Pos logit contributions
itten+0.019
 unnecessary+0.019
OO+0.019
 disease+0.018
 fault+0.018
Neg logit contributions
 next-0.019
 Avery-0.016
 timer-0.016
 Claus-0.016
Santa-0.015
Token Jimmy
Token ID12963
Feature activation-3.15e-02

Pos logit contributions
 next+0.044
 timer+0.041
 Avery+0.041
 Amen+0.040
iza+0.039
Neg logit contributions
itten-0.019
OO-0.018
op-0.018
 fault-0.017
 Ow-0.017
Token.
Token ID13
Feature activation+9.91e-03

Pos logit contributions
 next+0.029
 timer+0.027
 Avery+0.025
 Maya+0.022
 Amber+0.022
Neg logit contributions
op-0.061
itten-0.061
OO-0.061
 fault-0.058
 disease-0.057
Token He
Token ID679
Feature activation-8.05e-03

Pos logit contributions
itten+0.016
OO+0.016
 fault+0.016
 muscles+0.015
op+0.015
Neg logit contributions
 next-0.012
 timer-0.010
 Claus-0.010
 Avery-0.010
iza-0.010
Token loved
Token ID6151
Feature activation-3.44e-02

Pos logit contributions
 next+0.013
 Avery+0.012
 timer+0.012
 Claus+0.011
 Amber+0.011
Neg logit contributions
itten-0.010
OO-0.010
op-0.009
 pierced-0.009
 disease-0.009
Token to
Token ID284
Feature activation+2.22e-02

Pos logit contributions
 next+0.042
 smiling+0.030
 Avery+0.030
 Anna+0.029
 timer+0.029
Neg logit contributions
itten-0.064
OO-0.064
 unnecessary-0.062
eware-0.061
op-0.060
Token visit
Token ID3187
Feature activation-4.71e-02

Pos logit contributions
itten+0.032
OO+0.030
 unnecessary+0.030
op+0.030
 disease+0.029
Neg logit contributions
 next-0.034
 Avery-0.029
 timer-0.028
 Claus-0.028
 Anna-0.027
Token the
Token ID262
Feature activation-6.71e-03

Pos logit contributions
 next+0.055
 Avery+0.045
 timer+0.043
 Claus+0.042
 Anna+0.041
Neg logit contributions
itten-0.083
 unnecessary-0.081
OO-0.081
 disease-0.079
op-0.079
Token stadium
Token ID10308
Feature activation-2.85e-02

Pos logit contributions
 Avery+0.010
oe+0.009
 Amen+0.009
 Claus+0.009
 next+0.009
Neg logit contributions
 cob-0.008
itten-0.008
 spike-0.008
 hamm-0.008
 disease-0.007
Token and
Token ID290
Feature activation+7.34e-03

Pos logit contributions
 timer+0.038
 Avery+0.037
 Amen+0.037
maid+0.037
Santa+0.036
Neg logit contributions
op-0.038
age-0.036
ilst-0.036
 bars-0.035
eware-0.034
Token watch
Token ID2342
Feature activation-4.11e-03

Pos logit contributions
itten+0.010
 unnecessary+0.009
 disease+0.009
OO+0.009
op+0.009
Neg logit contributions
 next-0.011
 Avery-0.010
 timer-0.010
oe-0.010
 Claus-0.009
Token the
Token ID262
Feature activation-1.71e-02

Pos logit contributions
 next+0.005
 Claus+0.005
oe+0.005
 timer+0.004
 Avery+0.004
Neg logit contributions
 unnecessary-0.006
OO-0.006
 risks-0.006
itten-0.006
 muscles-0.006
Token the
Token ID262
Feature activation+8.31e-03

Pos logit contributions
 Claus+0.025
 waiter+0.023
 timer+0.023
ory+0.023
 Avery+0.022
Neg logit contributions
 risks-0.024
 unnecessary-0.023
 cob-0.023
 tal-0.023
 fault-0.022
Token sky
Token ID6766
Feature activation-1.22e-02

Pos logit contributions
 unnecessary+0.010
 risks+0.010
 tal+0.010
itten+0.010
 cob+0.010
Neg logit contributions
 next-0.013
 Avery-0.013
 Claus-0.012
 timer-0.012
 Maya-0.012
Token!"
Token ID2474
Feature activation-5.86e-03

Pos logit contributions
Friend+0.014
 Amen+0.014
 waiter+0.014
Say+0.014
 Avery+0.014
Neg logit contributions
 fault-0.019
OO-0.018
itten-0.018
op-0.017
age-0.017
Token\n
Token ID198
Feature activation+3.52e-03

Pos logit contributions
 next+0.010
 timer+0.009
ider+0.009
 Claus+0.009
 bride+0.008
Neg logit contributions
op-0.008
itten-0.007
 tal-0.007
apy-0.007
 Nut-0.006
Token\n
Token ID198
Feature activation-2.92e-03

Pos logit contributions
op+0.005
itten+0.005
 tal+0.005
age+0.004
apy+0.004
Neg logit contributions
 next-0.005
 Claus-0.005
 timer-0.005
ider-0.005
 smiling-0.004
TokenWith
Token ID3152
Feature activation-4.81e-02

Pos logit contributions
 next+0.005
 Claus+0.004
 smiling+0.004
 thanks+0.004
 timer+0.004
Neg logit contributions
op-0.004
itten-0.004
ending-0.003
age-0.003
OO-0.003
Token her
Token ID607
Feature activation+8.81e-03

Pos logit contributions
 next+0.061
 Avery+0.050
 timer+0.049
 Claus+0.048
Santa+0.046
Neg logit contributions
itten-0.080
 unnecessary-0.078
OO-0.077
 disease-0.076
 fault-0.075
Token parents
Token ID3397
Feature activation-4.08e-03

Pos logit contributions
itten+0.012
 muscles+0.011
 tal+0.011
 fault+0.011
 unnecessary+0.011
Neg logit contributions
 next-0.013
Cl-0.013
 Avery-0.013
 Claus-0.013
Anna-0.013
Token,
Token ID11
Feature activation+2.84e-02

Pos logit contributions
 next+0.006
 waiter+0.005
 Avery+0.005
Anna+0.005
 timer+0.005
Neg logit contributions
itten-0.006
op-0.006
 tal-0.005
ending-0.005
age-0.005
Token Lucy
Token ID22162
Feature activation+1.03e-03

Pos logit contributions
itten+0.037
op+0.037
 muscles+0.036
 unnecessary+0.035
 cob+0.035
Neg logit contributions
 next-0.041
 Amen-0.039
 timer-0.039
ider-0.037
Santa-0.036
Token travelled
Token ID21650
Feature activation-1.95e-02

Pos logit contributions
op+0.001
itten+0.001
age+0.001
ending+0.001
OO+0.001
Neg logit contributions
 next-0.001
 timer-0.001
 Amen-0.001
 waiter-0.001
 menus-0.001
Token surprise
Token ID5975
Feature activation-1.97e-02

Pos logit contributions
itten+0.035
 disease+0.033
OO+0.033
 fault+0.033
 unnecessary+0.032
Neg logit contributions
 next-0.043
 Avery-0.038
 Claus-0.037
Santa-0.036
 timer-0.036
Token for
Token ID329
Feature activation-4.74e-03

Pos logit contributions
 Navy+0.034
 Avery+0.034
 Claus+0.033
Friend+0.033
Santa+0.032
Neg logit contributions
itten-0.019
 fault-0.017
op-0.017
 infection-0.017
 disease-0.016
Token you
Token ID345
Feature activation-1.68e-02

Pos logit contributions
 next+0.005
 Claus+0.005
 Avery+0.004
oe+0.004
Santa+0.004
Neg logit contributions
 unnecessary-0.008
itten-0.008
 disease-0.008
 cob-0.008
OO-0.008
Token.
Token ID13
Feature activation+3.76e-03

Pos logit contributions
 timer+0.019
iza+0.018
 Avery+0.018
Friend+0.017
 Navy+0.017
Neg logit contributions
 fault-0.028
OO-0.027
're-0.026
op-0.026
eware-0.025
Token Come
Token ID7911
Feature activation-2.96e-02

Pos logit contributions
itten+0.005
op+0.005
eware+0.005
 Everything+0.005
ending+0.005
Neg logit contributions
 next-0.005
 timer-0.005
 menus-0.004
oe-0.004
 Avery-0.004
Token with
Token ID351
Feature activation-5.13e-02

Pos logit contributions
 timer+0.037
iza+0.037
 Avery+0.037
 waiter+0.036
Cl+0.035
Neg logit contributions
op-0.043
itten-0.041
age-0.041
 shelter-0.036
OO-0.036
Token me
Token ID502
Feature activation-1.75e-03

Pos logit contributions
 next+0.053
 Avery+0.047
 Claus+0.047
 timer+0.045
oe+0.044
Neg logit contributions
itten-0.090
 unnecessary-0.088
OO-0.088
 disease-0.088
 risks-0.084
Token."
Token ID526
Feature activation-6.26e-03

Pos logit contributions
 timer+0.002
Say+0.002
 Avery+0.002
iza+0.002
 Navy+0.002
Neg logit contributions
age-0.003
op-0.003
OO-0.003
itten-0.003
 fault-0.003
Token\n
Token ID198
Feature activation+2.82e-03

Pos logit contributions
 next+0.008
 timer+0.007
 Avery+0.006
 Claus+0.006
 waiter+0.006
Neg logit contributions
itten-0.010
op-0.010
OO-0.010
 tal-0.010
inter-0.009
Token\n
Token ID198
Feature activation-3.07e-03

Pos logit contributions
itten+0.005
op+0.005
OO+0.005
 tal+0.005
age+0.005
Neg logit contributions
 next-0.003
 Avery-0.003
 timer-0.003
 Claus-0.003
 waiter-0.002
TokenThe
Token ID464
Feature activation-2.39e-02

Pos logit contributions
 next+0.004
 Avery+0.003
 Claus+0.003
 timer+0.003
oe+0.003
Neg logit contributions
itten-0.005
op-0.005
 unnecessary-0.005
OO-0.005
 disease-0.005