TOP ACTIVATIONS
MAX = 0.128

into all the n ooks and cr ann ies .
. She grabbed the fire extingu isher and put out the
her up in her arms and said " Don 't worry
she held onto her mum and started to cry . Her
took her in her arms and reminded her to always stay
girl up in her arms and the dog finally went away
and James gathered some sticks and stones , and began their
Tim 's mom put butter and jam on it , just
ora thanked the old man and said goodbye . From then
gripped Alex 's tail tightly and looked up . He saw
They put on their coats and gloves and run outside .
Lily took out the plates and cups from the cup board
her grandma a big hug and shared her cookies with her
Jack put on their coats and hats and went outside with
She gave James a hug and put her phone back in
pat ?" He showed Ben and Lily how to pat the
mom handed her some soap and said , " Make sure
. Amanda thanked her mom and then resumed her racing around
They put on their coats and hats and go to the
. He thanked his Mama and got ready to play with

MIN ACTIVATIONS
MIN = -0.165

. " Oh no ! Where is my purse ?" she
box . " Where is my hat and coat
her mom my , " Where are we going ?" Her
" Let 's go ! Where shall we go today ?"
" Oh , no ! Where did you lose it ?"
sad . " Where is the fish ?" Tom
someone called out : " Where do you think you 're
empty . " Where is everyone ?" James asked
one . " Where is the key ?" Ben
goes . " Where does the slide go ?"
" Hello , puppy ! Where did you come from ?"
, " Oh my ! Where did you come from ?"
woods . " Where did it go ?" Tom
tree . " Where is the ball , Spot
part . " Where is it ?" Anna wonders
, " Hi Tom ! Where did you find that onion
Oh no , Lily ! Where is the ball ?" Tom
of them asked , " Where have you been ?" The
heart . " Where is the heart piece ,
garden . " Where are you going ?" asked

INTERVAL 0.055 - 0.128
CONTAINS 5.202%

. He shook his head and said , " We all
paint and paper . Tim painted a picture of the city
happy brothers . <|endoftext|> Jen and Ben were twins who liked
and they soon found many ac orns together . But ,
A man with a hat and a a pron is inside

INTERVAL -0.019 - 0.055
CONTAINS 71.184%

unlocked trees , and lots of other things too .
It was a gen ie ! The gen ie
ever the same again . <|endoftext|> Once upon a time ,
clips . When he finished , Joe stepped back
she stayed there for quite some time , playing with the

INTERVAL -0.092 - -0.019
CONTAINS 23.282%

man soup . They wish they had given him something icy
not hide magnets in the salad . They are not good
, " That 's just a pretend monster , Tim my
are very lucky to be able to read . Not all
umpy fisherman . He did not like to talk to anyone

INTERVAL -0.165 - -0.092
CONTAINS 0.331%

need Sam . He is too heavy to go up the
. I won 't do it again . I 'm sorry
. He realized he was wrong . He said , "
't . The needle is too big and the cloth is
. He said , " What are you looking at ,
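
The four INTERVAL ranges above appear to be equal-width bins spanning MIN to MAX. The sketch below reproduces the listed edges under that assumption and checks that the CONTAINS percentages account for nearly all tokens. The function name `interval_edges` and the bin count of 4 are illustrative, not taken from the tool that produced this dump, and MIN and MAX are themselves rounded, so the edges only match to about three decimal places.

```python
def interval_edges(lo, hi, n_bins=4):
    """Return n_bins + 1 equally spaced edges from lo to hi (assumed binning)."""
    width = (hi - lo) / n_bins
    return [lo + i * width for i in range(n_bins + 1)]


# MIN and MAX as listed in the dump (rounded values).
edges = interval_edges(-0.165, 0.128)

# CONTAINS percentages, ordered from the lowest interval to the highest.
shares = [0.331, 23.282, 71.184, 5.202]

for (left, right), share in zip(zip(edges, edges[1:]), shares):
    print(f"{left:+.3f} to {right:+.3f}: {share:.3f}%")

print(f"total: {sum(shares):.3f}%")
```

If the assumption holds, the printed edges should agree with the INTERVAL headers to within rounding, and the shares should sum to roughly 100%.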
Token into
Token ID656
Feature activation+1.68e-02

Pos logit contributions
Offic+0.046
years+0.045
 chatting+0.043
 airplanes+0.042
 terminal+0.041
Neg logit contributions
 am-0.035
 hurts-0.030
 loser-0.029
kin-0.029
 fault-0.028
Token all
Token ID477
Feature activation+4.50e-02

Pos logit contributions
years+0.035
Offic+0.033
 chatting+0.033
 airplanes+0.032
 terminal+0.031
Neg logit contributions
 am-0.026
 loser-0.023
kin-0.022
 hurts-0.022
ses-0.021
Token the
Token ID262
Feature activation+8.17e-03

Pos logit contributions
years+0.083
 airplanes+0.080
 chatting+0.079
Offic+0.077
 travelling+0.074
Neg logit contributions
 am-0.079
 loser-0.072
ses-0.069
kin-0.068
 hurts-0.067
Token n
Token ID299
Feature activation+4.09e-02

Pos logit contributions
years+0.013
 airplanes+0.013
 chatting+0.012
Offic+0.012
 travelling+0.012
Neg logit contributions
 am-0.016
 loser-0.015
ses-0.015
kin-0.014
 hurts-0.014
Tokenooks
Token ID31085
Feature activation+1.81e-02

Pos logit contributions
years+0.072
 chatting+0.066
 airplanes+0.066
Offic+0.065
 travelling+0.062
Neg logit contributions
 am-0.077
 loser-0.069
 hurts-0.067
ses-0.067
kin-0.065
Token and
Token ID290
Feature activation+1.28e-01

Pos logit contributions
years+0.037
Offic+0.036
 airplanes+0.035
 chatting+0.034
 terminal+0.034
Neg logit contributions
 am-0.028
 loser-0.025
kin-0.024
 hurts-0.023
ses-0.023
Token cr
Token ID1067
Feature activation+6.73e-02

Pos logit contributions
years+0.241
 chatting+0.234
Offic+0.231
 airplanes+0.222
 travelling+0.219
Neg logit contributions
 am-0.217
 hurts-0.187
 loser-0.186
ses-0.181
kin-0.178
Tokenann
Token ID1236
Feature activation+7.32e-02

Pos logit contributions
Offic+0.089
years+0.089
 chatting+0.087
 inviting+0.083
 travelling+0.082
Neg logit contributions
 am-0.147
 loser-0.135
 hurts-0.134
ses-0.133
 fault-0.131
Tokenies
Token ID444
Feature activation+5.65e-02

Pos logit contributions
years+0.117
 airplanes+0.110
Offic+0.109
 chatting+0.109
 travelling+0.102
Neg logit contributions
 am-0.146
 loser-0.132
ses-0.127
kin-0.127
 hurts-0.127
Token.
Token ID13
Feature activation+5.71e-03

Pos logit contributions
years+0.141
Offic+0.135
 airplanes+0.134
 chatting+0.133
 travelling+0.128
Neg logit contributions
 am-0.063
 loser-0.051
ses-0.048
kin-0.048
 hurts-0.048
Token
Token ID198
Feature activation+2.21e-02

Pos logit contributions
years+0.010
 chatting+0.010
 airplanes+0.010
Offic+0.010
 inviting+0.009
Neg logit contributions
 am-0.010
 loser-0.009
ses-0.009
kin-0.009
 hurts-0.009
Token.
Token ID13
Feature activation-1.26e-03

Pos logit contributions
 am+0.039
 loser+0.033
ses+0.033
kin+0.031
 hurts+0.031
Neg logit contributions
years-0.071
 airplanes-0.069
 chatting-0.069
Offic-0.067
 travelling-0.066
Token She
Token ID1375
Feature activation+1.57e-02

Pos logit contributions
 am+0.002
kin+0.002
 loser+0.002
ses+0.002
ger+0.002
Neg logit contributions
years-0.002
 chatting-0.002
Offic-0.002
 airplanes-0.002
 cities-0.002
Token grabbed
Token ID13176
Feature activation+6.42e-02

Pos logit contributions
years+0.028
 airplanes+0.027
 chatting+0.027
Offic+0.027
 travelling+0.025
Neg logit contributions
 am-0.028
 loser-0.025
 hurts-0.024
ses-0.024
kin-0.024
Token the
Token ID262
Feature activation+1.86e-02

Pos logit contributions
years+0.119
 airplanes+0.113
 chatting+0.112
Offic+0.111
 travelling+0.106
Neg logit contributions
 am-0.113
 loser-0.102
ses-0.097
kin-0.096
 hurts-0.096
Token fire
Token ID2046
Feature activation+2.61e-02

Pos logit contributions
 airplanes+0.030
 chatting+0.028
years+0.028
Offic+0.028
 travelling+0.028
Neg logit contributions
 am-0.036
 loser-0.032
kin-0.032
ses-0.031
ro-0.031
Token extingu
Token ID24995
Feature activation+1.27e-01

Pos logit contributions
 airplanes+0.046
 chatting+0.044
years+0.044
Offic+0.042
 travelling+0.041
Neg logit contributions
 am-0.047
 loser-0.043
ses-0.041
kin-0.041
 hurts-0.040
Tokenisher
Token ID4828
Feature activation-9.69e-03

Pos logit contributions
years+0.257
 airplanes+0.247
 chatting+0.247
Offic+0.246
 travelling+0.235
Neg logit contributions
 am-0.197
 loser-0.173
 hurts-0.168
ses-0.165
kin-0.162
Token and
Token ID290
Feature activation+1.07e-01

Pos logit contributions
 am+0.017
 loser+0.015
ses+0.015
kin+0.015
 hurts+0.015
Neg logit contributions
years-0.018
 airplanes-0.017
 chatting-0.017
Offic-0.016
 travelling-0.016
Token put
Token ID1234
Feature activation+8.00e-02

Pos logit contributions
years+0.204
Offic+0.182
 chatting+0.182
 airplanes+0.178
 cities+0.176
Neg logit contributions
 am-0.183
 hurts-0.161
kin-0.157
ses-0.154
ger-0.151
Token out
Token ID503
Feature activation+1.42e-03

Pos logit contributions
years+0.147
 airplanes+0.140
 chatting+0.138
Offic+0.137
 travelling+0.130
Neg logit contributions
 am-0.141
 loser-0.127
ses-0.122
 hurts-0.120
kin-0.120
Token the
Token ID262
Feature activation+1.93e-02

Pos logit contributions
years+0.003
Offic+0.002
 chatting+0.002
 airplanes+0.002
 travelling+0.002
Neg logit contributions
 am-0.002
 loser-0.002
 hurts-0.002
kin-0.002
ses-0.002
Token her
Token ID607
Feature activation+2.96e-02

Pos logit contributions
years+0.150
Offic+0.144
 chatting+0.142
 airplanes+0.141
 travelling+0.135
Neg logit contributions
 am-0.113
 loser-0.098
 hurts-0.096
ses-0.094
kin-0.092
Token up
Token ID510
Feature activation+3.31e-03

Pos logit contributions
years+0.054
 airplanes+0.053
 chatting+0.051
 travelling+0.049
 inviting+0.048
Neg logit contributions
 am-0.051
 loser-0.050
ses-0.047
kin-0.044
ger-0.043
Token in
Token ID287
Feature activation+1.77e-02

Pos logit contributions
years+0.006
 airplanes+0.006
 chatting+0.006
Offic+0.006
 travelling+0.005
Neg logit contributions
 am-0.006
 loser-0.005
ses-0.005
kin-0.005
 hurts-0.005
Token her
Token ID607
Feature activation+4.05e-02

Pos logit contributions
years+0.038
 airplanes+0.036
Offic+0.036
 chatting+0.035
 terminal+0.034
Neg logit contributions
 am-0.027
 loser-0.022
kin-0.022
 hurts-0.021
ses-0.021
Token arms
Token ID5101
Feature activation+2.29e-02

Pos logit contributions
years+0.072
 airplanes+0.069
 chatting+0.068
Offic+0.067
 travelling+0.064
Neg logit contributions
 am-0.073
 loser-0.067
ses-0.063
kin-0.063
 hurts-0.063
Token and
Token ID290
Feature activation+1.24e-01

Pos logit contributions
years+0.041
 airplanes+0.040
 chatting+0.039
Offic+0.038
 travelling+0.037
Neg logit contributions
 am-0.041
 loser-0.037
ses-0.036
kin-0.035
 hurts-0.035
Token said
Token ID531
Feature activation+6.42e-02

Pos logit contributions
years+0.238
Offic+0.215
 cities+0.212
 chatting+0.211
 airplanes+0.211
Neg logit contributions
 am-0.211
 hurts-0.190
kin-0.181
ses-0.181
 loser-0.171
Token "
Token ID366
Feature activation+3.29e-02

Pos logit contributions
years+0.132
 airplanes+0.126
 chatting+0.125
Offic+0.119
 travelling+0.118
Neg logit contributions
 am-0.098
 loser-0.089
ses-0.087
kin-0.083
 hurts-0.082
TokenDon
Token ID3987
Feature activation-5.24e-02

Pos logit contributions
years+0.062
 airplanes+0.060
 chatting+0.059
Offic+0.058
 travelling+0.056
Neg logit contributions
 am-0.057
 loser-0.050
 hurts-0.048
kin-0.047
ses-0.046
Token't
Token ID470
Feature activation-4.34e-02

Pos logit contributions
 am+0.075
ses+0.075
 loser+0.071
kin+0.068
 fault+0.067
Neg logit contributions
years-0.105
Offic-0.104
 airplanes-0.103
 chatting-0.099
 inviting-0.097
Token worry
Token ID5490
Feature activation-6.91e-02

Pos logit contributions
 am+0.070
 loser+0.062
ses+0.061
kin+0.061
 hurts+0.057
Neg logit contributions
years-0.085
 airplanes-0.082
 chatting-0.082
Offic-0.080
 inviting-0.077
Token she
Token ID673
Feature activation+4.01e-04

Pos logit contributions
years+0.089
Offic+0.084
irs+0.081
 airplanes+0.081
ellery+0.079
Neg logit contributions
 am-0.048
 hurts-0.041
ses-0.039
ger-0.038
chy-0.038
Token held
Token ID2714
Feature activation+7.68e-02

Pos logit contributions
years+0.001
 airplanes+0.001
Offic+0.001
 chatting+0.001
 cities+0.001
Neg logit contributions
 am-0.001
 hurts-0.001
 loser-0.001
ses-0.001
ger-0.001
Token onto
Token ID4291
Feature activation+4.41e-02

Pos logit contributions
years+0.126
 airplanes+0.119
 chatting+0.117
 travelling+0.114
 inviting+0.111
Neg logit contributions
 am-0.147
 loser-0.141
ses-0.131
 fault-0.129
 hurts-0.128
Token her
Token ID607
Feature activation+3.18e-02

Pos logit contributions
years+0.084
Offic+0.079
 chatting+0.079
 airplanes+0.078
 travelling+0.073
Neg logit contributions
 am-0.076
 loser-0.066
ses-0.065
 hurts-0.064
kin-0.063
Token mum
Token ID25682
Feature activation+7.79e-02

Pos logit contributions
 chatting+0.051
 inviting+0.050
 airplanes+0.050
years+0.050
 travelling+0.049
Neg logit contributions
 am-0.058
 loser-0.057
ses-0.051
ger-0.051
ro-0.050
Token and
Token ID290
Feature activation+1.23e-01

Pos logit contributions
years+0.190
Offic+0.183
 airplanes+0.176
 chatting+0.175
 terminal+0.173
Neg logit contributions
 am-0.086
 hurts-0.071
ger-0.064
 loser-0.063
ny-0.063
Token started
Token ID2067
Feature activation+9.79e-03

Pos logit contributions
years+0.235
Offic+0.220
 airplanes+0.218
 libraries+0.213
 cities+0.211
Neg logit contributions
 am-0.210
kin-0.172
'm-0.171
 hurts-0.169
ses-0.163
Token to
Token ID284
Feature activation+1.82e-02

Pos logit contributions
years+0.022
Offic+0.021
 airplanes+0.020
 cities+0.020
 terminal+0.020
Neg logit contributions
 am-0.014
 hurts-0.011
'm-0.010
 fault-0.010
ses-0.009
Token cry
Token ID3960
Feature activation-1.69e-03

Pos logit contributions
years+0.040
 airplanes+0.036
Offic+0.036
 chatting+0.035
 cities+0.034
Neg logit contributions
 am-0.029
 hurts-0.024
 loser-0.023
ses-0.022
'm-0.021
Token.
Token ID13
Feature activation+2.08e-03

Pos logit contributions
 am+0.002
ses+0.002
 loser+0.002
kin+0.002
 hurts+0.002
Neg logit contributions
years-0.004
 airplanes-0.004
 chatting-0.004
 inviting-0.004
 travelling-0.004
Token Her
Token ID2332
Feature activation+2.45e-02

Pos logit contributions
years+0.003
 chatting+0.003
Offic+0.003
 airplanes+0.003
 travelling+0.003
Neg logit contributions
 am-0.004
 loser-0.004
kin-0.004
ses-0.003
 hurts-0.003
Token took
Token ID1718
Feature activation+7.68e-02

Pos logit contributions
years+0.071
yles+0.069
 libraries+0.064
 cities+0.064
nam+0.064
Neg logit contributions
 am-0.047
ger-0.044
ny-0.039
 sting-0.036
 hurts-0.035
Token her
Token ID607
Feature activation+3.00e-02

Pos logit contributions
years+0.122
 airplanes+0.115
 chatting+0.114
Offic+0.113
 travelling+0.105
Neg logit contributions
 am-0.156
 loser-0.139
ses-0.135
 hurts-0.135
kin-0.134
Token in
Token ID287
Feature activation+1.68e-02

Pos logit contributions
years+0.055
 airplanes+0.055
 chatting+0.053
Offic+0.052
 travelling+0.051
Neg logit contributions
 am-0.052
 loser-0.047
kin-0.046
ses-0.045
 hurts-0.043
Token her
Token ID607
Feature activation+4.57e-02

Pos logit contributions
years+0.035
 airplanes+0.033
Offic+0.033
 chatting+0.032
 terminal+0.031
Neg logit contributions
 am-0.027
 loser-0.022
ses-0.022
 hurts-0.022
kin-0.021
Token arms
Token ID5101
Feature activation+2.86e-02

Pos logit contributions
years+0.080
 airplanes+0.079
 chatting+0.077
Offic+0.076
 travelling+0.073
Neg logit contributions
 am-0.082
 loser-0.076
kin-0.072
ses-0.070
 hurts-0.069
Token and
Token ID290
Feature activation+1.22e-01

Pos logit contributions
years+0.058
Offic+0.056
 airplanes+0.054
 chatting+0.053
 cities+0.051
Neg logit contributions
 am-0.045
 loser-0.039
 hurts-0.038
kin-0.038
ses-0.038
Token reminded
Token ID14516
Feature activation+3.34e-02

Pos logit contributions
years+0.272
 cities+0.235
 kingdoms+0.229
irs+0.227
 libraries+0.222
Neg logit contributions
 am-0.183
 hurts-0.162
ses-0.141
 thing-0.138
ger-0.137
Token her
Token ID607
Feature activation+1.23e-02

Pos logit contributions
years+0.047
 airplanes+0.043
 chatting+0.042
Offic+0.042
 travelling+0.039
Neg logit contributions
 am-0.074
 loser-0.069
ses-0.067
kin-0.066
 hurts-0.066
Token to
Token ID284
Feature activation+5.23e-03

Pos logit contributions
 airplanes+0.021
years+0.020
 chatting+0.020
 travelling+0.019
 visiting+0.019
Neg logit contributions
 am-0.022
ses-0.021
 loser-0.021
kin-0.020
 hurts-0.019
Token always
Token ID1464
Feature activation+2.56e-02

Pos logit contributions
years+0.011
 chatting+0.011
 airplanes+0.011
Offic+0.011
 travelling+0.011
Neg logit contributions
 am-0.007
 loser-0.006
ses-0.006
kin-0.006
 hurts-0.006
Token stay
Token ID2652
Feature activation+2.02e-02

Pos logit contributions
years+0.058
 airplanes+0.056
 chatting+0.056
Offic+0.055
 travelling+0.053
Neg logit contributions
 am-0.034
 loser-0.029
ses-0.028
kin-0.027
 hurts-0.027
Token girl
Token ID2576
Feature activation-2.78e-03

Pos logit contributions
 airplanes+0.028
Offic+0.026
 chatting+0.026
years+0.025
 inviting+0.025
Neg logit contributions
 am-0.027
 loser-0.024
kin-0.023
ses-0.022
 fault-0.022
Token up
Token ID510
Feature activation+2.14e-03

Pos logit contributions
 am+0.005
 loser+0.004
ses+0.004
kin+0.004
 hurts+0.004
Neg logit contributions
years-0.005
 airplanes-0.005
 chatting-0.005
Offic-0.005
 travelling-0.005
Token in
Token ID287
Feature activation+8.64e-03

Pos logit contributions
years+0.004
 airplanes+0.004
 chatting+0.004
Offic+0.004
 travelling+0.004
Neg logit contributions
 am-0.004
 loser-0.003
ses-0.003
kin-0.003
 hurts-0.003
Token her
Token ID607
Feature activation+3.03e-02

Pos logit contributions
years+0.019
 airplanes+0.018
Offic+0.017
 chatting+0.017
 terminal+0.017
Neg logit contributions
 am-0.013
 loser-0.011
 hurts-0.011
ses-0.011
kin-0.011
Token arms
Token ID5101
Feature activation+2.19e-02

Pos logit contributions
 airplanes+0.046
 chatting+0.044
years+0.043
 travelling+0.040
 inviting+0.040
Neg logit contributions
 am-0.061
 loser-0.059
kin-0.052
ger-0.051
 fault-0.051
Token and
Token ID290
Feature activation+1.22e-01

Pos logit contributions
years+0.038
 airplanes+0.036
 chatting+0.036
Offic+0.034
 travelling+0.033
Neg logit contributions
 am-0.041
 loser-0.037
ses-0.036
kin-0.035
 hurts-0.035
Token the
Token ID262
Feature activation+3.06e-02

Pos logit contributions
years+0.212
 airplanes+0.191
Offic+0.189
 chatting+0.187
 cities+0.183
Neg logit contributions
 am-0.232
 hurts-0.208
ses-0.203
 loser-0.202
kin-0.200
Token dog
Token ID3290
Feature activation+3.06e-02

Pos logit contributions
 airplanes+0.063
years+0.062
Offic+0.061
 chatting+0.061
 inviting+0.058
Neg logit contributions
 am-0.046
 loser-0.039
kin-0.038
 fault-0.037
ses-0.036
Token finally
Token ID3443
Feature activation-5.03e-03

Pos logit contributions
years+0.058
 airplanes+0.055
 chatting+0.055
Offic+0.054
 travelling+0.052
Neg logit contributions
 am-0.052
 loser-0.046
ses-0.045
 hurts-0.044
kin-0.044
Token went
Token ID1816
Feature activation+4.38e-02

Pos logit contributions
 am+0.010
 loser+0.009
ses+0.009
kin+0.008
 hurts+0.008
Neg logit contributions
years-0.008
 chatting-0.008
 airplanes-0.008
Offic-0.008
 travelling-0.007
Token away
Token ID1497
Feature activation-3.73e-02

Pos logit contributions
years+0.092
 airplanes+0.088
 chatting+0.087
Offic+0.087
 travelling+0.083
Neg logit contributions
 am-0.067
 loser-0.058
ses-0.056
kin-0.055
 hurts-0.055
Token and
Token ID290
Feature activation+9.50e-02

Pos logit contributions
years+0.027
 airplanes+0.025
 chatting+0.025
Offic+0.025
 travelling+0.023
Neg logit contributions
 am-0.031
 loser-0.027
ses-0.027
kin-0.027
 hurts-0.026
Token James
Token ID3700
Feature activation+1.07e-02

Pos logit contributions
years+0.164
 airplanes+0.154
Offic+0.152
 chatting+0.151
 terminal+0.144
Neg logit contributions
 am-0.177
 loser-0.158
ses-0.152
 hurts-0.152
kin-0.151
Token gathered
Token ID9272
Feature activation+9.31e-02

Pos logit contributions
Offic+0.020
 airplanes+0.019
years+0.019
 libraries+0.019
 chatting+0.018
Neg logit contributions
 am-0.018
kin-0.015
 loser-0.014
'm-0.014
ny-0.014
Token some
Token ID617
Feature activation+3.81e-02

Pos logit contributions
years+0.172
Offic+0.162
 airplanes+0.162
 chatting+0.161
 travelling+0.152
Neg logit contributions
 am-0.165
 loser-0.145
ses-0.141
kin-0.140
 hurts-0.139
Token sticks
Token ID16461
Feature activation+3.10e-02

Pos logit contributions
years+0.071
Offic+0.067
 chatting+0.066
 airplanes+0.065
 terminal+0.062
Neg logit contributions
 am-0.068
 loser-0.059
 hurts-0.057
kin-0.057
ses-0.057
Token and
Token ID290
Feature activation+1.21e-01

Pos logit contributions
years+0.062
Offic+0.059
 airplanes+0.057
 chatting+0.057
 terminal+0.054
Neg logit contributions
 am-0.051
 loser-0.044
kin-0.043
ses-0.043
 hurts-0.042
Token stones
Token ID14966
Feature activation+4.01e-02

Pos logit contributions
years+0.231
Offic+0.222
 chatting+0.215
 terminal+0.209
 airplanes+0.206
Neg logit contributions
 am-0.200
kin-0.168
ses-0.167
 hurts-0.165
 loser-0.163
Token,
Token ID11
Feature activation+6.99e-02

Pos logit contributions
years+0.079
Offic+0.074
 airplanes+0.073
 chatting+0.073
 travelling+0.069
Neg logit contributions
 am-0.067
 loser-0.058
kin-0.056
ses-0.056
 hurts-0.055
Token and
Token ID290
Feature activation+1.01e-01

Pos logit contributions
Offic+0.136
years+0.136
 terminal+0.131
irs+0.126
 hangar+0.124
Neg logit contributions
ro-0.082
 am-0.080
ses-0.075
 frog-0.073
kin-0.071
Token began
Token ID2540
Feature activation+4.80e-03

Pos logit contributions
years+0.187
Offic+0.181
 terminal+0.168
 chatting+0.166
 travelling+0.162
Neg logit contributions
 am-0.155
kin-0.141
 thing-0.135
ger-0.133
 loser-0.133
Token their
Token ID511
Feature activation+1.71e-02

Pos logit contributions
years+0.012
Offic+0.011
 terminal+0.010
 libraries+0.010
 airplanes+0.010
Neg logit contributions
 am-0.006
 fault-0.004
 thing-0.004
ses-0.004
 hurts-0.004
Token Tim
Token ID5045
Feature activation+6.01e-03

Pos logit contributions
years+0.053
 airplanes+0.049
 chatting+0.049
Offic+0.049
 terminal+0.046
Neg logit contributions
 am-0.049
 loser-0.043
 hurts-0.041
kin-0.041
ses-0.040
Token's
Token ID338
Feature activation+1.70e-02

Pos logit contributions
years+0.011
 airplanes+0.010
 chatting+0.010
Offic+0.010
 travelling+0.009
Neg logit contributions
 am-0.011
 loser-0.010
ses-0.010
kin-0.010
 hurts-0.009
Token mom
Token ID1995
Feature activation+4.01e-02

Pos logit contributions
years+0.033
 airplanes+0.033
 chatting+0.032
Offic+0.031
 travelling+0.030
Neg logit contributions
 am-0.027
 loser-0.024
ses-0.023
ger-0.023
kin-0.022
Token put
Token ID1234
Feature activation+8.50e-02

Pos logit contributions
years+0.077
 airplanes+0.071
 cities+0.070
Offic+0.070
 travelling+0.068
Neg logit contributions
 am-0.070
 hurts-0.059
kin-0.057
'm-0.055
 loser-0.055
Token butter
Token ID9215
Feature activation+3.31e-02

Pos logit contributions
years+0.167
Offic+0.159
 chatting+0.158
 airplanes+0.158
 terminal+0.149
Neg logit contributions
 am-0.140
 loser-0.122
 hurts-0.118
kin-0.117
ses-0.117
Token and
Token ID290
Feature activation+1.21e-01

Pos logit contributions
years+0.063
 airplanes+0.060
 chatting+0.059
Offic+0.058
 travelling+0.056
Neg logit contributions
 am-0.057
 loser-0.050
ses-0.049
 hurts-0.048
kin-0.047
Token jam
Token ID16853
Feature activation+1.80e-02

Pos logit contributions
years+0.223
 airplanes+0.208
Offic+0.207
 chatting+0.207
 travelling+0.198
Neg logit contributions
 am-0.214
 loser-0.186
 hurts-0.185
kin-0.183
ses-0.182
Token on
Token ID319
Feature activation-3.34e-03

Pos logit contributions
years+0.033
 airplanes+0.032
 chatting+0.031
Offic+0.030
 travelling+0.029
Neg logit contributions
 am-0.031
 loser-0.028
ses-0.028
 hurts-0.027
 fault-0.026
Token it
Token ID340
Feature activation-4.10e-02

Pos logit contributions
 am+0.006
 loser+0.006
ses+0.005
kin+0.005
 hurts+0.005
Neg logit contributions
years-0.006
 airplanes-0.006
 chatting-0.005
Offic-0.005
 inviting-0.005
Token,
Token ID11
Feature activation+5.63e-02

Pos logit contributions
 am+0.054
 loser+0.049
ses+0.046
kin+0.043
 hurts+0.042
Neg logit contributions
years-0.092
 airplanes-0.089
 chatting-0.088
Offic-0.085
 travelling-0.084
Token just
Token ID655
Feature activation-1.88e-02

Pos logit contributions
years+0.119
Offic+0.110
 airplanes+0.107
 terminal+0.105
 chatting+0.102
Neg logit contributions
 am-0.087
 hurts-0.075
 loser-0.074
ses-0.073
kin-0.071
Tokenora
Token ID5799
Feature activation+3.59e-02

Pos logit contributions
years+0.029
 travelling+0.027
 chatting+0.027
ules+0.026
 visiting+0.026
Neg logit contributions
ses-0.031
 am-0.030
 hurts-0.029
 loser-0.028
 snake-0.028
Token thanked
Token ID26280
Feature activation+8.22e-02

Pos logit contributions
years+0.060
 airplanes+0.057
 chatting+0.057
Offic+0.056
 travelling+0.053
Neg logit contributions
 am-0.068
 loser-0.063
ses-0.061
 hurts-0.059
kin-0.059
Token the
Token ID262
Feature activation+2.70e-02

Pos logit contributions
years+0.153
Offic+0.145
 airplanes+0.145
 chatting+0.142
 terminal+0.136
Neg logit contributions
 am-0.143
 loser-0.123
kin-0.119
ses-0.119
 hurts-0.118
Token old
Token ID1468
Feature activation+2.68e-02

Pos logit contributions
years+0.050
 airplanes+0.050
Offic+0.048
 travelling+0.047
 chatting+0.047
Neg logit contributions
 am-0.046
 loser-0.042
ses-0.041
kin-0.040
 hurts-0.040
Token man
Token ID582
Feature activation+3.78e-02

Pos logit contributions
years+0.061
 chatting+0.057
 airplanes+0.057
Offic+0.057
 travelling+0.054
Neg logit contributions
 am-0.037
 loser-0.031
ses-0.030
 hurts-0.029
kin-0.029
Token and
Token ID290
Feature activation+1.21e-01

Pos logit contributions
years+0.072
 airplanes+0.068
Offic+0.068
 chatting+0.068
 terminal+0.064
Neg logit contributions
 am-0.064
 loser-0.056
ses-0.054
kin-0.054
 hurts-0.053
Token said
Token ID531
Feature activation+6.78e-02

Pos logit contributions
years+0.213
 airplanes+0.189
Offic+0.188
 chatting+0.184
 libraries+0.183
Neg logit contributions
 am-0.225
kin-0.193
 hurts-0.190
ger-0.190
ses-0.188
Token goodbye
Token ID24829
Feature activation+1.94e-02

Pos logit contributions
years+0.121
 airplanes+0.115
 chatting+0.114
Offic+0.112
 travelling+0.108
Neg logit contributions
 am-0.123
 loser-0.112
ses-0.109
 hurts-0.106
kin-0.105
Token.
Token ID13
Feature activation+5.92e-03

Pos logit contributions
years+0.040
 chatting+0.039
 airplanes+0.039
 travelling+0.037
Offic+0.037
Neg logit contributions
 am-0.030
 loser-0.026
 hurts-0.025
kin-0.025
ses-0.025
Token From
Token ID3574
Feature activation+4.53e-02

Pos logit contributions
years+0.010
 chatting+0.010
 airplanes+0.010
Offic+0.010
 travelling+0.009
Neg logit contributions
 am-0.011
 loser-0.010
ses-0.009
 hurts-0.009
kin-0.009
Token then
Token ID788
Feature activation-7.94e-04

Pos logit contributions
years+0.083
 airplanes+0.080
 chatting+0.078
Offic+0.077
 travelling+0.075
Neg logit contributions
 am-0.081
 loser-0.073
ses-0.070
 hurts-0.069
kin-0.069
Token gripped
Token ID42046
Feature activation+7.70e-02

Pos logit contributions
years+0.020
Offic+0.019
 airplanes+0.018
 chatting+0.018
 libraries+0.018
Neg logit contributions
 am-0.018
 loser-0.015
ger-0.015
kin-0.014
ses-0.014
Token Alex
Token ID4422
Feature activation+2.13e-02

Pos logit contributions
years+0.145
 airplanes+0.139
Offic+0.138
 chatting+0.137
 terminal+0.129
Neg logit contributions
 am-0.133
 loser-0.116
ses-0.114
kin-0.112
 hurts-0.111
Token's
Token ID338
Feature activation+4.59e-02

Pos logit contributions
years+0.043
 airplanes+0.041
 chatting+0.040
Offic+0.040
 travelling+0.038
Neg logit contributions
 am-0.034
 loser-0.030
ses-0.029
kin-0.028
 hurts-0.028
Token tail
Token ID7894
Feature activation+2.72e-02

Pos logit contributions
years+0.100
 airplanes+0.097
 chatting+0.095
Offic+0.095
 travelling+0.091
Neg logit contributions
 am-0.066
 loser-0.057
ses-0.054
kin-0.053
 hurts-0.053
Token tightly
Token ID17707
Feature activation+1.05e-02

Pos logit contributions
years+0.058
Offic+0.056
 airplanes+0.054
 chatting+0.054
 libraries+0.052
Neg logit contributions
 am-0.039
 loser-0.034
kin-0.033
 hurts-0.033
ses-0.033
Token and
Token ID290
Feature activation+1.21e-01

Pos logit contributions
years+0.023
 airplanes+0.022
 chatting+0.022
Offic+0.022
 travelling+0.021
Neg logit contributions
 am-0.015
 loser-0.013
ses-0.012
kin-0.012
 hurts-0.012
Token looked
Token ID3114
Feature activation+3.59e-02

Pos logit contributions
years+0.224
Offic+0.197
 airplanes+0.197
 chatting+0.193
 libraries+0.190
Neg logit contributions
 am-0.219
ses-0.192
 hurts-0.190
kin-0.184
ger-0.182
Token up
Token ID510
Feature activation-3.60e-03

Pos logit contributions
Offic+0.076
years+0.076
 airplanes+0.071
 chatting+0.070
 terminal+0.069
Neg logit contributions
 am-0.052
 hurts-0.042
 loser-0.042
kin-0.042
ger-0.041
Token.
Token ID13
Feature activation+7.04e-03

Pos logit contributions
 am+0.006
 loser+0.006
kin+0.005
 hurts+0.005
ses+0.005
Neg logit contributions
years-0.007
Offic-0.006
 airplanes-0.006
 chatting-0.006
 terminal-0.006
Token He
Token ID679
Feature activation+1.82e-02

Pos logit contributions
years+0.014
 airplanes+0.013
 chatting+0.013
Offic+0.013
 travelling+0.012
Neg logit contributions
 am-0.012
 loser-0.010
ses-0.010
kin-0.010
 hurts-0.010
Token saw
Token ID2497
Feature activation+4.88e-02

Pos logit contributions
years+0.041
 airplanes+0.039
 chatting+0.039
Offic+0.038
 libraries+0.038
Neg logit contributions
 am-0.025
 hurts-0.019
ses-0.018
ger-0.018
'm-0.018
TokenThey
Token ID2990
Feature activation+1.79e-02

Pos logit contributions
years+0.020
 airplanes+0.019
 chatting+0.018
Offic+0.018
 travelling+0.017
Neg logit contributions
 am-0.028
 loser-0.025
kin-0.024
ses-0.024
 hurts-0.024
Token put
Token ID1234
Feature activation+7.82e-02

Pos logit contributions
years+0.037
 airplanes+0.035
 chatting+0.035
Offic+0.035
 travelling+0.033
Neg logit contributions
 am-0.028
 loser-0.024
ses-0.023
kin-0.023
 hurts-0.023
Token on
Token ID319
Feature activation+6.62e-03

Pos logit contributions
 airplanes+0.121
years+0.121
 chatting+0.115
Offic+0.109
 inviting+0.108
Neg logit contributions
 am-0.156
 loser-0.143
ses-0.136
 hurts-0.134
 fault-0.133
Token their
Token ID511
Feature activation+4.59e-02

Pos logit contributions
years+0.010
 airplanes+0.010
 chatting+0.010
Offic+0.010
 inviting+0.009
Neg logit contributions
 am-0.013
 loser-0.012
ses-0.012
kin-0.012
 hurts-0.011
Token coats
Token ID30720
Feature activation+5.54e-02

Pos logit contributions
years+0.067
 airplanes+0.062
 chatting+0.061
Offic+0.061
 travelling+0.056
Neg logit contributions
 am-0.100
 loser-0.090
ses-0.088
 hurts-0.088
kin-0.087
Token and
Token ID290
Feature activation+1.20e-01

Pos logit contributions
years+0.124
Offic+0.114
 chatting+0.111
 airplanes+0.110
 inviting+0.107
Neg logit contributions
 am-0.081
kin-0.067
 loser-0.067
ses-0.066
ro-0.065
Token gloves
Token ID18051
Feature activation+4.91e-02

Pos logit contributions
years+0.171
 airplanes+0.163
 chatting+0.159
Offic+0.158
 travelling+0.147
Neg logit contributions
 am-0.259
 loser-0.238
ses-0.229
kin-0.228
 hurts-0.227
Token and
Token ID290
Feature activation+1.12e-01

Pos logit contributions
years+0.094
 airplanes+0.089
 chatting+0.088
Offic+0.088
 travelling+0.083
Neg logit contributions
 am-0.084
 loser-0.074
ses-0.071
kin-0.071
 hurts-0.070
Token run
Token ID1057
Feature activation+4.91e-02

Pos logit contributions
years+0.187
 airplanes+0.179
 chatting+0.176
Offic+0.175
 travelling+0.165
Neg logit contributions
 am-0.214
 loser-0.194
ses-0.187
kin-0.185
 hurts-0.185
Token outside
Token ID2354
Feature activation-6.26e-03

Pos logit contributions
years+0.088
 airplanes+0.086
 chatting+0.084
Offic+0.081
 travelling+0.079
Neg logit contributions
 am-0.087
 loser-0.079
ses-0.077
kin-0.077
 hurts-0.075
Token.
Token ID13
Feature activation+2.41e-03

Pos logit contributions
 am+0.008
 loser+0.007
ses+0.007
kin+0.006
 hurts+0.006
Neg logit contributions
years-0.014
 airplanes-0.014
 chatting-0.014
Offic-0.013
 travelling-0.013
Token Lily
Token ID20037
Feature activation-1.30e-02

Pos logit contributions
years+0.057
 chatting+0.055
 airplanes+0.054
 inviting+0.052
Offic+0.052
Neg logit contributions
 am-0.063
 loser-0.058
ses-0.057
 fault-0.055
 hurts-0.055
Token took
Token ID1718
Feature activation+7.32e-02

Pos logit contributions
 am+0.024
 loser+0.021
kin+0.021
 hurts+0.021
ses+0.021
Neg logit contributions
years-0.023
 airplanes-0.021
 chatting-0.021
Offic-0.021
 travelling-0.020
Token out
Token ID503
Feature activation+1.74e-03

Pos logit contributions
years+0.135
 airplanes+0.128
 chatting+0.126
Offic+0.125
 travelling+0.119
Neg logit contributions
 am-0.131
 loser-0.117
ses-0.114
kin-0.112
 hurts-0.111
Token the
Token ID262
Feature activation+1.13e-02

Pos logit contributions
years+0.004
Offic+0.004
 chatting+0.004
 terminal+0.004
 travelling+0.004
Neg logit contributions
 am-0.002
 thing-0.001
kin-0.001
 hurts-0.001
 loser-0.001
Token plates
Token ID13375
Feature activation+4.22e-02

Pos logit contributions
 airplanes+0.018
years+0.017
 chatting+0.017
 inviting+0.016
 travelling+0.016
Neg logit contributions
 am-0.021
 loser-0.021
ses-0.020
kin-0.019
ger-0.019
Token and
Token ID290
Feature activation+1.20e-01

Pos logit contributions
years+0.080
Offic+0.075
 airplanes+0.075
 chatting+0.075
 travelling+0.071
Neg logit contributions
 am-0.072
 loser-0.064
kin-0.062
ses-0.061
 hurts-0.061
Token cups
Token ID14180
Feature activation+6.44e-02

Pos logit contributions
years+0.218
Offic+0.204
 airplanes+0.199
 chatting+0.198
 travelling+0.193
Neg logit contributions
 am-0.214
kin-0.188
 hurts-0.185
 loser-0.183
ses-0.181
Token from
Token ID422
Feature activation+5.34e-02

Pos logit contributions
years+0.128
 airplanes+0.124
 chatting+0.122
Offic+0.120
 travelling+0.115
Neg logit contributions
 am-0.103
 loser-0.092
ses-0.088
 hurts-0.087
kin-0.085
Token the
Token ID262
Feature activation+2.10e-02

Pos logit contributions
years+0.101
 airplanes+0.096
 chatting+0.095
Offic+0.094
 travelling+0.089
Neg logit contributions
 am-0.094
 loser-0.084
ses-0.080
kin-0.079
 hurts-0.079
Token cup
Token ID6508
Feature activation+3.44e-02

Pos logit contributions
years+0.031
 airplanes+0.030
 chatting+0.029
Offic+0.028
 terminal+0.027
Neg logit contributions
 am-0.044
 loser-0.041
ses-0.040
kin-0.039
 hurts-0.039
Tokenboard
Token ID3526
Feature activation+4.89e-04

Pos logit contributions
years+0.064
 airplanes+0.061
 chatting+0.060
Offic+0.060
 travelling+0.056
Neg logit contributions
 am-0.059
 loser-0.053
ses-0.052
 hurts-0.050
kin-0.050
Token her
Token ID607
Feature activation+2.74e-02

Pos logit contributions
years+0.149
Offic+0.144
 chatting+0.144
 airplanes+0.139
 terminal+0.136
Neg logit contributions
 am-0.154
 hurts-0.136
 loser-0.134
ses-0.131
kin-0.129
Token grandma
Token ID49890
Feature activation+5.82e-02

Pos logit contributions
years+0.052
 chatting+0.049
Offic+0.049
 airplanes+0.048
 travelling+0.046
Neg logit contributions
 am-0.048
 loser-0.041
 hurts-0.040
ses-0.040
kin-0.040
Token a
Token ID257
Feature activation+3.36e-02

Pos logit contributions
years+0.130
Offic+0.124
 chatting+0.123
 airplanes+0.123
 travelling+0.118
Neg logit contributions
 am-0.079
 loser-0.067
kin-0.064
 hurts-0.063
ses-0.063
Token big
Token ID1263
Feature activation+4.65e-02

Pos logit contributions
years+0.069
 airplanes+0.065
 chatting+0.064
Offic+0.064
 travelling+0.061
Neg logit contributions
 am-0.054
 loser-0.047
ses-0.045
kin-0.044
 hurts-0.044
Token hug
Token ID16225
Feature activation+4.68e-02

Pos logit contributions
years+0.105
 airplanes+0.099
 chatting+0.099
Offic+0.098
 travelling+0.094
Neg logit contributions
 am-0.064
 loser-0.055
ses-0.053
 hurts-0.052
kin-0.052
Token and
Token ID290
Feature activation+1.20e-01

Pos logit contributions
years+0.081
 airplanes+0.079
 chatting+0.078
Offic+0.076
 travelling+0.073
Neg logit contributions
 am-0.086
 loser-0.078
ses-0.075
 hurts-0.074
kin-0.073
Token shared
Token ID4888
Feature activation+4.84e-02

Pos logit contributions
years+0.220
 airplanes+0.197
 chatting+0.196
Offic+0.195
 cities+0.189
Neg logit contributions
 am-0.222
ses-0.191
 hurts-0.191
kin-0.188
 loser-0.187
Token her
Token ID607
Feature activation+3.65e-02

Pos logit contributions
years+0.094
Offic+0.087
 terminal+0.085
 airplanes+0.084
 chatting+0.084
Neg logit contributions
 am-0.083
 hurts-0.069
kin-0.069
 loser-0.067
 thing-0.066
Token cookies
Token ID14746
Feature activation+3.34e-02

Pos logit contributions
years+0.060
 airplanes+0.060
 chatting+0.059
Offic+0.057
 travelling+0.055
Neg logit contributions
 am-0.069
 loser-0.063
ses-0.060
 hurts-0.059
kin-0.059
Token with
Token ID351
Feature activation+4.74e-02

Pos logit contributions
years+0.066
 airplanes+0.064
 chatting+0.064
Offic+0.062
 travelling+0.060
Neg logit contributions
 am-0.053
 loser-0.047
ses-0.046
 hurts-0.045
kin-0.044
Token her
Token ID607
Feature activation+2.86e-02

Pos logit contributions
years+0.086
Offic+0.085
 chatting+0.082
 cities+0.079
 terminal+0.079
Neg logit contributions
 am-0.080
 hurts-0.068
 loser-0.067
 fault-0.063
kin-0.062
Token Jack
Token ID3619
Feature activation+7.36e-03

Pos logit contributions
years+0.167
 airplanes+0.159
 chatting+0.158
Offic+0.157
 travelling+0.149
Neg logit contributions
 am-0.131
 loser-0.116
ses-0.111
kin-0.110
 hurts-0.109
Token put
Token ID1234
Feature activation+7.35e-02

Pos logit contributions
years+0.014
Offic+0.014
 chatting+0.014
 airplanes+0.014
 travelling+0.013
Neg logit contributions
 am-0.013
kin-0.010
 loser-0.010
 hurts-0.010
ses-0.010
Token on
Token ID319
Feature activation+3.84e-03

Pos logit contributions
years+0.146
 airplanes+0.138
 chatting+0.138
Offic+0.137
 travelling+0.130
Neg logit contributions
 am-0.121
 loser-0.106
kin-0.102
ses-0.102
 hurts-0.101
Token their
Token ID511
Feature activation+3.85e-02

Pos logit contributions
years+0.007
Offic+0.007
 chatting+0.007
 airplanes+0.007
 travelling+0.006
Neg logit contributions
 am-0.007
 loser-0.006
ses-0.006
 hurts-0.006
kin-0.006
Token coats
Token ID30720
Feature activation+5.39e-02

Pos logit contributions
years+0.073
 chatting+0.067
Offic+0.066
 airplanes+0.062
 travelling+0.060
Neg logit contributions
 am-0.068
ses-0.061
 hurts-0.060
kin-0.059
 loser-0.059
Token and
Token ID290
Feature activation+1.20e-01

Pos logit contributions
years+0.109
Offic+0.102
 chatting+0.101
 airplanes+0.100
 travelling+0.096
Neg logit contributions
 am-0.087
 loser-0.076
kin-0.073
ses-0.072
 hurts-0.071
Token hats
Token ID23910
Feature activation+5.26e-02

Pos logit contributions
years+0.224
 chatting+0.205
Offic+0.197
 travelling+0.192
 airplanes+0.190
Neg logit contributions
 am-0.216
ses-0.190
kin-0.190
 hurts-0.185
 loser-0.181
Token and
Token ID290
Feature activation+1.14e-01

Pos logit contributions
years+0.121
Offic+0.115
 chatting+0.111
 airplanes+0.110
 inviting+0.107
Neg logit contributions
 am-0.071
 loser-0.060
kin-0.059
ses-0.057
 hurts-0.056
Token went
Token ID1816
Feature activation+4.56e-02

Pos logit contributions
years+0.210
 chatting+0.192
Offic+0.182
 travelling+0.179
 inviting+0.175
Neg logit contributions
 am-0.207
ses-0.182
kin-0.179
 hurts-0.178
 loser-0.175
Token outside
Token ID2354
Feature activation-1.46e-02

Pos logit contributions
years+0.117
Offic+0.114
 chatting+0.105
David+0.103
 inviting+0.103
Neg logit contributions
 am-0.049
 hurts-0.034
ro-0.033
 loser-0.033
 thing-0.032
Token with
Token ID351
Feature activation+2.55e-02

Pos logit contributions
 am+0.016
 loser+0.013
 hurts+0.011
kin+0.011
ses+0.011
Neg logit contributions
years-0.037
Offic-0.036
 airplanes-0.036
 chatting-0.035
 terminal-0.034
Token She
Token ID1375
Feature activation+3.04e-02

Pos logit contributions
years+0.021
 airplanes+0.020
 chatting+0.020
Offic+0.020
 travelling+0.019
Neg logit contributions
 am-0.015
 loser-0.013
ses-0.012
kin-0.012
 hurts-0.012
Token gave
Token ID2921
Feature activation+8.58e-02

Pos logit contributions
years+0.057
 airplanes+0.056
 cities+0.056
 chatting+0.055
 libraries+0.054
Neg logit contributions
 am-0.050
kin-0.045
ger-0.041
 hurts-0.041
ses-0.040
Token James
Token ID3700
Feature activation+1.03e-02

Pos logit contributions
years+0.141
 chatting+0.134
Offic+0.133
 airplanes+0.132
 travelling+0.124
Neg logit contributions
 am-0.166
 loser-0.149
 hurts-0.146
kin-0.143
ses-0.142
Token a
Token ID257
Feature activation+3.39e-02

Pos logit contributions
years+0.022
 airplanes+0.021
 chatting+0.021
Offic+0.020
 travelling+0.019
Neg logit contributions
 am-0.016
 loser-0.014
ses-0.013
kin-0.013
 hurts-0.013
Token hug
Token ID16225
Feature activation+4.80e-02

Pos logit contributions
years+0.062
 airplanes+0.060
 chatting+0.059
Offic+0.059
 travelling+0.056
Neg logit contributions
 am-0.058
 loser-0.052
ses-0.050
kin-0.049
 hurts-0.049
Token and
Token ID290
Feature activation+1.20e-01

Pos logit contributions
years+0.087
 airplanes+0.082
 chatting+0.081
Offic+0.081
 travelling+0.077
Neg logit contributions
 am-0.086
 loser-0.077
ses-0.074
kin-0.074
 hurts-0.073
Token put
Token ID1234
Feature activation+8.27e-02

Pos logit contributions
years+0.217
 airplanes+0.197
 chatting+0.196
Offic+0.195
 cities+0.187
Neg logit contributions
 am-0.220
 hurts-0.189
kin-0.188
 loser-0.187
ses-0.186
Token her
Token ID607
Feature activation+3.20e-02

Pos logit contributions
years+0.157
 airplanes+0.149
 chatting+0.148
Offic+0.147
 travelling+0.139
Neg logit contributions
 am-0.142
 loser-0.126
ses-0.121
kin-0.120
 hurts-0.120
Token phone
Token ID3072
Feature activation+3.29e-02

Pos logit contributions
years+0.062
 airplanes+0.059
 chatting+0.058
Offic+0.057
 travelling+0.055
Neg logit contributions
 am-0.054
 loser-0.049
ses-0.047
kin-0.046
 hurts-0.046
Token back
Token ID736
Feature activation+2.23e-02

Pos logit contributions
years+0.074
Offic+0.072
 chatting+0.071
 airplanes+0.069
 terminal+0.067
Neg logit contributions
 am-0.044
 loser-0.037
kin-0.036
 hurts-0.036
ger-0.034
Token in
Token ID287
Feature activation+1.35e-02

Pos logit contributions
years+0.049
Offic+0.048
 airplanes+0.047
 chatting+0.046
 terminal+0.045
Neg logit contributions
 am-0.031
 loser-0.026
kin-0.025
 hurts-0.025
ses-0.024
Token pat
Token ID1458
Feature activation+4.36e-02

Pos logit contributions
 airplanes+0.017
Offic+0.016
years+0.016
 chatting+0.016
 travelling+0.016
Neg logit contributions
 am-0.012
 loser-0.011
kin-0.011
ses-0.010
ger-0.010
Token?"
Token ID1701
Feature activation+1.19e-02

Pos logit contributions
 airplanes+0.086
years+0.085
Offic+0.084
 chatting+0.081
 travelling+0.078
Neg logit contributions
 am-0.062
ses-0.061
 loser-0.057
 fault-0.056
kin-0.055
Token He
Token ID679
Feature activation+2.65e-02

Pos logit contributions
years+0.021
Offic+0.020
 airplanes+0.020
 chatting+0.020
 cities+0.019
Neg logit contributions
 am-0.021
ses-0.019
 loser-0.019
kin-0.019
 hurts-0.019
Token showed
Token ID3751
Feature activation+7.55e-02

Pos logit contributions
years+0.052
 airplanes+0.049
 chatting+0.049
 cities+0.047
irs+0.046
Neg logit contributions
 am-0.044
 hurts-0.038
ger-0.037
ses-0.037
kin-0.037
Token Ben
Token ID3932
Feature activation+1.03e-02

Pos logit contributions
years+0.095
 airplanes+0.089
Offic+0.085
 chatting+0.083
 cities+0.079
Neg logit contributions
 am-0.179
 loser-0.164
ses-0.160
kin-0.159
 hurts-0.157
Token and
Token ID290
Feature activation+1.20e-01

Pos logit contributions
years+0.018
 airplanes+0.017
 chatting+0.017
Offic+0.016
 travelling+0.016
Neg logit contributions
 am-0.019
 loser-0.018
ses-0.017
kin-0.017
 hurts-0.016
Token Lily
Token ID20037
Feature activation+1.73e-02

Pos logit contributions
years+0.263
 airplanes+0.252
 chatting+0.250
Offic+0.249
 travelling+0.238
Neg logit contributions
 am-0.169
 loser-0.146
ses-0.139
kin-0.138
 hurts-0.137
Token how
Token ID703
Feature activation-2.88e-02

Pos logit contributions
years+0.039
Offic+0.037
 airplanes+0.036
 chatting+0.036
 travelling+0.034
Neg logit contributions
 am-0.024
 hurts-0.020
 loser-0.020
ses-0.019
kin-0.019
Token to
Token ID284
Feature activation+6.14e-05

Pos logit contributions
 loser+0.055
 am+0.054
ses+0.052
kin+0.050
 hurts+0.047
Neg logit contributions
years-0.045
 airplanes-0.043
 chatting-0.043
 travelling-0.041
Offic-0.040
Token pat
Token ID1458
Feature activation+6.13e-02

Pos logit contributions
Offic+0.000
 airplanes+0.000
 chatting+0.000
years+0.000
 inviting+0.000
Neg logit contributions
 am-0.000
 loser-0.000
ses-0.000
kin-0.000
 hurts-0.000
Token the
Token ID262
Feature activation+1.23e-02

Pos logit contributions
years+0.088
 airplanes+0.085
Offic+0.083
 chatting+0.082
 travelling+0.077
Neg logit contributions
 am-0.130
 loser-0.120
ses-0.117
kin-0.117
 hurts-0.114
Token mom
Token ID1995
Feature activation+5.44e-02

Pos logit contributions
 airplanes+0.045
 chatting+0.044
years+0.043
 inviting+0.042
 travelling+0.041
Neg logit contributions
 am-0.044
 loser-0.040
ger-0.037
ses-0.037
kin-0.036
Token handed
Token ID10158
Feature activation+6.66e-02

Pos logit contributions
years+0.105
 cities+0.101
 terminal+0.100
 airplanes+0.099
 libraries+0.098
Neg logit contributions
 am-0.087
kin-0.074
ger-0.071
 hurts-0.071
chy-0.070
Token her
Token ID607
Feature activation+2.66e-02

Pos logit contributions
years+0.101
 airplanes+0.095
Offic+0.094
 chatting+0.094
 travelling+0.087
Neg logit contributions
 am-0.141
 loser-0.126
 hurts-0.123
ses-0.122
kin-0.122
Token some
Token ID617
Feature activation+3.29e-02

Pos logit contributions
 airplanes+0.046
years+0.044
 chatting+0.043
 travelling+0.041
 inviting+0.041
Neg logit contributions
 am-0.048
 loser-0.048
ses-0.045
kin-0.042
 hurts-0.041
Token soap
Token ID19533
Feature activation+8.48e-03

Pos logit contributions
 airplanes+0.060
years+0.058
 chatting+0.056
 inviting+0.056
 travelling+0.054
Neg logit contributions
 am-0.056
 loser-0.054
ses-0.052
kin-0.048
 fault-0.047
Token and
Token ID290
Feature activation+1.20e-01

Pos logit contributions
 airplanes+0.011
 chatting+0.011
years+0.011
 travelling+0.010
 inviting+0.010
Neg logit contributions
 am-0.018
 loser-0.017
ses-0.017
 hurts-0.016
kin-0.016
Token said
Token ID531
Feature activation+6.10e-02

Pos logit contributions
years+0.228
Offic+0.212
 chatting+0.212
 airplanes+0.211
 travelling+0.201
Neg logit contributions
 am-0.207
 loser-0.180
 hurts-0.177
kin-0.176
ses-0.175
Token,
Token ID11
Feature activation+5.42e-02

Pos logit contributions
years+0.119
 chatting+0.114
 airplanes+0.112
 inviting+0.108
 travelling+0.106
Neg logit contributions
 am-0.098
ses-0.089
 loser-0.089
 hurts-0.083
ger-0.081
Token "
Token ID366
Feature activation+2.99e-02

Pos logit contributions
years+0.128
 airplanes+0.124
Offic+0.121
 terminal+0.120
 cities+0.117
Neg logit contributions
 am-0.066
 loser-0.054
ses-0.053
kin-0.051
 hurts-0.049
TokenMake
Token ID12050
Feature activation+1.96e-03

Pos logit contributions
years+0.060
 airplanes+0.060
 chatting+0.059
 travelling+0.057
Offic+0.056
Neg logit contributions
 am-0.047
 loser-0.041
kin-0.037
 hurts-0.037
ro-0.037
Token sure
Token ID1654
Feature activation-1.19e-02

Pos logit contributions
years+0.003
 airplanes+0.003
 chatting+0.003
Offic+0.003
 inviting+0.003
Neg logit contributions
 am-0.004
 loser-0.004
ses-0.004
kin-0.003
 hurts-0.003
Token.
Token ID13
Feature activation+3.72e-03

Pos logit contributions
 am+0.027
 loser+0.024
ses+0.023
 hurts+0.022
kin+0.022
Neg logit contributions
 airplanes-0.042
 chatting-0.042
years-0.040
 travelling-0.040
 cities-0.039
Token Amanda
Token ID23040
Feature activation+4.21e-02

Pos logit contributions
years+0.007
 airplanes+0.007
 chatting+0.007
Offic+0.007
 travelling+0.007
Neg logit contributions
 am-0.006
 loser-0.005
ses-0.005
 hurts-0.005
kin-0.005
Token thanked
Token ID26280
Feature activation+8.23e-02

Pos logit contributions
years+0.090
Offic+0.086
 airplanes+0.086
 chatting+0.086
 travelling+0.082
Neg logit contributions
 am-0.062
 loser-0.052
 hurts-0.050
ses-0.050
kin-0.050
Token her
Token ID607
Feature activation+2.70e-02

Pos logit contributions
years+0.192
 chatting+0.171
Offic+0.170
aris+0.168
eping+0.167
Neg logit contributions
 am-0.105
 thing-0.081
 hurts-0.075
kin-0.074
 happen-0.074
Token mom
Token ID1995
Feature activation+6.18e-02

Pos logit contributions
years+0.067
Offic+0.063
 chatting+0.063
 airplanes+0.062
 travelling+0.060
Neg logit contributions
 am-0.032
 loser-0.026
kin-0.026
ses-0.026
 hurts-0.024
Token and
Token ID290
Feature activation+1.19e-01

Pos logit contributions
years+0.128
Offic+0.123
 airplanes+0.121
 chatting+0.119
 terminal+0.115
Neg logit contributions
 am-0.093
 loser-0.080
kin-0.077
 hurts-0.077
ses-0.076
Token then
Token ID788
Feature activation+3.50e-02

Pos logit contributions
years+0.239
Offic+0.208
 libraries+0.203
 cities+0.200
 chatting+0.199
Neg logit contributions
 am-0.197
kin-0.168
 hurts-0.159
ger-0.158
ses-0.155
Token resumed
Token ID28291
Feature activation+3.63e-02

Pos logit contributions
years+0.064
Offic+0.058
 airplanes+0.055
 chatting+0.054
 libraries+0.053
Neg logit contributions
 am-0.067
 hurts-0.056
kin-0.055
ger-0.054
 loser-0.053
Token her
Token ID607
Feature activation+1.19e-02

Pos logit contributions
years+0.072
Offic+0.068
ents+0.064
 terminal+0.063
 airplanes+0.062
Neg logit contributions
 am-0.062
 loser-0.049
 thing-0.049
kin-0.048
 hurts-0.048
Token racing
Token ID11717
Feature activation+4.54e-02

Pos logit contributions
years+0.023
Offic+0.021
 chatting+0.020
 airplanes+0.020
 terminal+0.019
Neg logit contributions
 am-0.021
 loser-0.019
 hurts-0.018
ses-0.018
kin-0.017
Token around
Token ID1088
Feature activation+9.76e-03

Pos logit contributions
years+0.102
Offic+0.093
 chatting+0.091
irs+0.084
 inviting+0.084
Neg logit contributions
 am-0.067
ger-0.056
 loser-0.056
 thing-0.055
kin-0.053
Token They
Token ID1119
Feature activation+2.30e-02

Pos logit contributions
years+0.005
 airplanes+0.005
 chatting+0.005
Offic+0.005
 travelling+0.005
Neg logit contributions
 am-0.004
 loser-0.003
ses-0.003
kin-0.003
 hurts-0.003
Token put
Token ID1234
Feature activation+8.07e-02

Pos logit contributions
years+0.052
 chatting+0.047
 airplanes+0.047
Offic+0.047
 travelling+0.044
Neg logit contributions
 am-0.035
 loser-0.027
ses-0.026
kin-0.026
 hurts-0.026
Token on
Token ID319
Feature activation+8.68e-03

Pos logit contributions
years+0.166
Offic+0.156
 chatting+0.155
 airplanes+0.152
 terminal+0.147
Neg logit contributions
 am-0.128
 loser-0.110
ses-0.108
kin-0.108
 hurts-0.106
Token their
Token ID511
Feature activation+5.10e-02

Pos logit contributions
years+0.017
Offic+0.016
 chatting+0.016
 airplanes+0.016
 travelling+0.015
Neg logit contributions
 am-0.014
 loser-0.012
ses-0.012
 hurts-0.012
kin-0.011
Token coats
Token ID30720
Feature activation+5.57e-02

Pos logit contributions
years+0.079
Offic+0.074
 chatting+0.072
 airplanes+0.070
 terminal+0.066
Neg logit contributions
 am-0.108
ses-0.096
 hurts-0.096
 loser-0.095
kin-0.092
Token and
Token ID290
Feature activation+1.19e-01

Pos logit contributions
years+0.120
Offic+0.111
 chatting+0.109
 airplanes+0.108
 travelling+0.104
Neg logit contributions
 am-0.084
 loser-0.071
kin-0.071
ses-0.068
ro-0.068
Token hats
Token ID23910
Feature activation+5.50e-02

Pos logit contributions
years+0.200
 chatting+0.178
Offic+0.172
 airplanes+0.170
 travelling+0.163
Neg logit contributions
 am-0.245
 hurts-0.214
ses-0.214
 loser-0.211
kin-0.207
Token and
Token ID290
Feature activation+1.13e-01

Pos logit contributions
years+0.123
Offic+0.115
 chatting+0.112
 airplanes+0.110
 inviting+0.106
Neg logit contributions
 am-0.080
 loser-0.067
kin-0.066
ses-0.064
 hurts-0.064
Token go
Token ID467
Feature activation+4.17e-02

Pos logit contributions
years+0.201
 chatting+0.179
Offic+0.172
 airplanes+0.168
 travelling+0.164
Neg logit contributions
 am-0.222
 hurts-0.191
ses-0.190
 loser-0.188
kin-0.182
Token to
Token ID284
Feature activation+2.10e-02

Pos logit contributions
years+0.101
Offic+0.096
 chatting+0.091
 airplanes+0.089
David+0.088
Neg logit contributions
 am-0.055
ger-0.039
 loser-0.038
ro-0.038
 hurts-0.037
Token the
Token ID262
Feature activation+1.00e-02

Pos logit contributions
years+0.048
 chatting+0.044
Offic+0.044
 airplanes+0.044
 terminal+0.041
Neg logit contributions
 am-0.030
 loser-0.023
ses-0.022
 hurts-0.022
kin-0.021
Token.
Token ID13
Feature activation-3.03e-03

Pos logit contributions
years+0.010
Offic+0.009
 airplanes+0.009
 chatting+0.009
 terminal+0.009
Neg logit contributions
 am-0.007
 loser-0.005
 hurts-0.005
ger-0.005
 thing-0.005
Token He
Token ID679
Feature activation+1.40e-02

Pos logit contributions
 am+0.004
 loser+0.004
kin+0.004
ses+0.004
 hurts+0.004
Neg logit contributions
years-0.007
 airplanes-0.006
Offic-0.006
 chatting-0.006
 travelling-0.006
Token thanked
Token ID26280
Feature activation+7.61e-02

Pos logit contributions
years+0.026
 airplanes+0.024
 chatting+0.024
Offic+0.024
 cities+0.024
Neg logit contributions
 am-0.024
 hurts-0.020
kin-0.020
ses-0.020
 loser-0.020
Token his
Token ID465
Feature activation+2.61e-02

Pos logit contributions
years+0.186
 chatting+0.176
 terminal+0.168
 airplanes+0.167
Offic+0.165
Neg logit contributions
 am-0.091
 thing-0.070
 loser-0.067
kin-0.067
 hurts-0.063
Token Mama
Token ID40084
Feature activation+6.32e-02

Pos logit contributions
years+0.054
 chatting+0.051
 airplanes+0.050
Offic+0.050
 terminal+0.048
Neg logit contributions
 am-0.041
 loser-0.036
ses-0.035
kin-0.034
 hurts-0.034
Token and
Token ID290
Feature activation+1.19e-01

Pos logit contributions
years+0.129
Offic+0.121
 airplanes+0.120
 chatting+0.119
 terminal+0.115
Neg logit contributions
 am-0.098
 loser-0.085
 hurts-0.082
kin-0.082
ses-0.081
Token got
Token ID1392
Feature activation+5.09e-02

Pos logit contributions
years+0.223
 airplanes+0.202
 chatting+0.199
 cities+0.197
 libraries+0.197
Neg logit contributions
 am-0.211
 hurts-0.176
kin-0.175
ses-0.175
 loser-0.167
Token ready
Token ID3492
Feature activation+7.22e-03

Pos logit contributions
years+0.102
Offic+0.100
 chatting+0.097
 airplanes+0.097
 cities+0.093
Neg logit contributions
 am-0.081
 hurts-0.067
 loser-0.067
kin-0.063
ses-0.062
Token to
Token ID284
Feature activation+7.69e-03

Pos logit contributions
years+0.016
Offic+0.016
 airplanes+0.015
 chatting+0.015
 terminal+0.015
Neg logit contributions
 am-0.010
kin-0.008
 loser-0.008
 hurts-0.008
ses-0.007
Token play
Token ID711
Feature activation+5.34e-03

Pos logit contributions
years+0.017
 chatting+0.016
Offic+0.016
 airplanes+0.016
 cities+0.015
Neg logit contributions
 am-0.011
 loser-0.009
 hurts-0.009
kin-0.009
ses-0.009
Token with
Token ID351
Feature activation+1.27e-02

Pos logit contributions
years+0.012
Offic+0.011
 chatting+0.011
 airplanes+0.011
 terminal+0.011
Neg logit contributions
 am-0.008
 loser-0.006
kin-0.006
ger-0.006
 hurts-0.006
Token.
Token ID13
Feature activation-2.88e-02

Pos logit contributions
 am+0.044
 loser+0.040
ses+0.038
kin+0.037
 hurts+0.036
Neg logit contributions
years-0.061
 airplanes-0.060
 chatting-0.059
Offic-0.058
 travelling-0.056
Token "
Token ID366
Feature activation+2.87e-03

Pos logit contributions
 am+0.044
 loser+0.038
ses+0.037
kin+0.036
 hurts+0.036
Neg logit contributions
years-0.059
 airplanes-0.057
 chatting-0.056
Offic-0.056
 travelling-0.053
TokenOh
Token ID5812
Feature activation-9.03e-02

Pos logit contributions
years+0.006
 airplanes+0.005
 chatting+0.005
Offic+0.005
 travelling+0.005
Neg logit contributions
 am-0.005
 loser-0.004
kin-0.004
ses-0.004
 hurts-0.004
Token no
Token ID645
Feature activation-3.89e-02

Pos logit contributions
 am+0.150
 loser+0.138
ro+0.138
ger+0.138
ses+0.136
Neg logit contributions
years-0.160
 chatting-0.153
 airplanes-0.150
 inviting-0.145
 travelling-0.145
Token!
Token ID0
Feature activation-6.40e-02

Pos logit contributions
 am+0.058
ses+0.052
kin+0.051
 loser+0.049
 hurts+0.046
Neg logit contributions
years-0.078
 chatting-0.077
 airplanes-0.076
Offic-0.075
 travelling-0.075
Token Where
Token ID6350
Feature activation-1.42e-01

Pos logit contributions
 am+0.091
kin+0.083
ses+0.082
ger+0.082
 loser+0.076
Neg logit contributions
years-0.132
Offic-0.125
 chatting-0.116
ents-0.114
 libraries-0.113
Token is
Token ID318
Feature activation-5.48e-02

Pos logit contributions
ses+0.197
 loser+0.196
 am+0.191
ger+0.184
ro+0.181
Neg logit contributions
years-0.275
 airplanes-0.273
 chatting-0.256
Offic-0.254
 libraries-0.253
Token my
Token ID616
Feature activation-5.08e-02

Pos logit contributions
 am+0.090
 loser+0.080
ses+0.077
kin+0.076
 hurts+0.075
Neg logit contributions
years-0.108
 airplanes-0.103
 chatting-0.102
Offic-0.100
 travelling-0.097
Token purse
Token ID27517
Feature activation-3.80e-02

Pos logit contributions
 am+0.098
 loser+0.089
kin+0.087
 hurts+0.085
ses+0.084
Neg logit contributions
 airplanes-0.078
years-0.076
 chatting-0.074
Offic-0.073
 travelling-0.072
Token?"
Token ID1701
Feature activation-9.04e-03

Pos logit contributions
 am+0.051
ses+0.047
 loser+0.047
kin+0.043
 fault+0.041
Neg logit contributions
 airplanes-0.081
years-0.079
 chatting-0.077
Offic-0.076
 travelling-0.075
Token she
Token ID673
Feature activation-1.20e-03

Pos logit contributions
 am+0.016
 loser+0.014
ses+0.013
kin+0.013
 hurts+0.013
Neg logit contributions
years-0.017
 airplanes-0.016
 chatting-0.016
Offic-0.016
 travelling-0.015
Token box
Token ID3091
Feature activation-4.28e-02

Pos logit contributions
 am+0.093
 loser+0.087
kin+0.086
ses+0.082
adia+0.081
Neg logit contributions
 airplanes-0.113
 chatting-0.103
years-0.100
 travelling-0.097
 terminal-0.094
Token.
Token ID13
Feature activation-3.92e-02

Pos logit contributions
 am+0.063
 loser+0.057
ses+0.054
kin+0.052
 hurts+0.051
Neg logit contributions
 airplanes-0.087
years-0.087
 chatting-0.085
 travelling-0.081
Offic-0.080
Token
Token ID198
Feature activation-1.64e-02

Pos logit contributions
 am+0.057
 loser+0.050
ses+0.048
kin+0.047
 hurts+0.047
Neg logit contributions
years-0.084
 airplanes-0.080
 chatting-0.080
Offic-0.079
 travelling-0.076
Token
Token ID198
Feature activation-1.60e-02

Pos logit contributions
 am+0.022
 loser+0.017
 hurts+0.016
kin+0.016
ses+0.016
Neg logit contributions
years-0.038
 airplanes-0.037
 chatting-0.037
Offic-0.036
 travelling-0.035
Token"
Token ID1
Feature activation-3.15e-03

Pos logit contributions
 am+0.028
 loser+0.025
kin+0.023
ger+0.022
 hurts+0.022
Neg logit contributions
 airplanes-0.030
years-0.029
 chatting-0.029
 travelling-0.027
 cities-0.026
TokenWhere
Token ID8496
Feature activation-1.41e-01

Pos logit contributions
 am+0.006
 loser+0.005
ses+0.005
kin+0.005
 hurts+0.005
Neg logit contributions
years-0.006
 airplanes-0.005
 chatting-0.005
Offic-0.005
 travelling-0.005
Token is
Token ID318
Feature activation-6.65e-02

Pos logit contributions
 loser+0.195
ro+0.195
ses+0.194
ger+0.188
 am+0.175
Neg logit contributions
 airplanes-0.260
years-0.251
 libraries-0.247
 talked-0.243
 chatting-0.242
Token my
Token ID616
Feature activation-5.77e-02

Pos logit contributions
 am+0.101
 loser+0.089
ses+0.085
 hurts+0.084
kin+0.084
Neg logit contributions
years-0.139
 airplanes-0.131
 chatting-0.131
Offic-0.130
 travelling-0.124
Token hat
Token ID6877
Feature activation-2.40e-02

Pos logit contributions
 am+0.099
 loser+0.086
kin+0.083
 hurts+0.083
ger+0.080
Neg logit contributions
 airplanes-0.105
years-0.100
 chatting-0.099
Offic-0.098
 travelling-0.098
Token and
Token ID290
Feature activation+4.75e-02

Pos logit contributions
 am+0.031
ses+0.028
 loser+0.028
kin+0.026
ger+0.025
Neg logit contributions
 airplanes-0.052
years-0.050
 chatting-0.050
Offic-0.049
 travelling-0.048
Token coat
Token ID13209
Feature activation-6.75e-04

Pos logit contributions
years+0.086
 airplanes+0.085
 chatting+0.082
Offic+0.082
 travelling+0.078
Neg logit contributions
 am-0.083
 loser-0.076
ses-0.071
kin-0.071
 hurts-0.070
Token her
Token ID607
Feature activation-3.40e-02

Pos logit contributions
years+0.057
 airplanes+0.054
 chatting+0.053
Offic+0.053
 inviting+0.050
Neg logit contributions
 am-0.068
 loser-0.063
ses-0.060
kin-0.060
 hurts-0.058
Token mom
Token ID1995
Feature activation+7.87e-03

Pos logit contributions
 am+0.061
 loser+0.056
 hurts+0.051
ger+0.051
kin+0.051
Neg logit contributions
 airplanes-0.058
years-0.058
 chatting-0.058
 travelling-0.054
 visiting-0.053
Tokenmy
Token ID1820
Feature activation-4.08e-04

Pos logit contributions
years+0.015
 airplanes+0.014
 chatting+0.014
Offic+0.014
 travelling+0.013
Neg logit contributions
 am-0.014
 loser-0.012
ses-0.012
 hurts-0.012
kin-0.012
Token,
Token ID11
Feature activation+3.51e-02

Pos logit contributions
 am+0.001
 loser+0.001
ses+0.001
kin+0.001
 hurts+0.001
Neg logit contributions
years-0.001
 airplanes-0.001
 chatting-0.001
Offic-0.001
 travelling-0.001
Token "
Token ID366
Feature activation+8.68e-03

Pos logit contributions
years+0.066
 airplanes+0.061
Offic+0.061
 chatting+0.060
 terminal+0.058
Neg logit contributions
 am-0.062
 loser-0.054
kin-0.052
 hurts-0.052
ses-0.051
TokenWhere
Token ID8496
Feature activation-1.41e-01

Pos logit contributions
years+0.016
 airplanes+0.016
 chatting+0.016
 travelling+0.015
Offic+0.015
Neg logit contributions
 am-0.015
 loser-0.013
kin-0.013
 hurts-0.012
ses-0.012
Token are
Token ID389
Feature activation-5.44e-02

Pos logit contributions
ses+0.215
 loser+0.215
ro+0.210
ger+0.208
 am+0.196
Neg logit contributions
 airplanes-0.244
years-0.241
 chatting-0.239
 talked-0.232
 libraries-0.230
Token we
Token ID356
Feature activation-7.19e-02

Pos logit contributions
 am+0.079
 loser+0.069
 hurts+0.065
ses+0.065
kin+0.064
Neg logit contributions
years-0.117
 airplanes-0.112
Offic-0.111
 chatting-0.111
 travelling-0.105
Token going
Token ID1016
Feature activation-4.33e-02

Pos logit contributions
 am+0.166
 loser+0.149
kin+0.142
 hurts+0.142
ses+0.140
Neg logit contributions
years-0.098
Offic-0.094
 airplanes-0.090
 chatting-0.085
 terminal-0.082
Token?"
Token ID1701
Feature activation-8.38e-03

Pos logit contributions
 am+0.064
 loser+0.055
ses+0.052
 hurts+0.052
kin+0.051
Neg logit contributions
years-0.093
 airplanes-0.088
Offic-0.088
 chatting-0.088
 travelling-0.083
Token Her
Token ID2332
Feature activation+1.58e-02

Pos logit contributions
 am+0.013
 loser+0.011
 hurts+0.011
ses+0.011
kin+0.011
Neg logit contributions
years-0.017
 airplanes-0.016
Offic-0.016
 chatting-0.016
 cities-0.015
Token "
Token ID366
Feature activation-4.42e-03

Pos logit contributions
years+0.020
 airplanes+0.019
 chatting+0.019
Offic+0.018
 travelling+0.018
Neg logit contributions
 am-0.020
 loser-0.018
ses-0.017
kin-0.017
 hurts-0.017
TokenLet
Token ID5756
Feature activation-4.21e-02

Pos logit contributions
 am+0.008
 loser+0.007
ses+0.007
kin+0.007
 hurts+0.007
Neg logit contributions
years-0.008
Offic-0.008
 airplanes-0.008
 chatting-0.008
 travelling-0.007
Token's
Token ID338
Feature activation-1.02e-02

Pos logit contributions
ses+0.084
kin+0.082
 am+0.082
 loser+0.082
ro+0.074
Neg logit contributions
 airplanes-0.064
 chatting-0.057
 inviting-0.057
years-0.056
Offic-0.055
Token go
Token ID467
Feature activation+2.01e-03

Pos logit contributions
 am+0.017
 loser+0.017
ses+0.017
kin+0.016
ger+0.016
Neg logit contributions
 airplanes-0.018
 chatting-0.017
 inviting-0.016
years-0.016
Offic-0.016
Token!
Token ID0
Feature activation-5.01e-02

Pos logit contributions
years+0.004
 airplanes+0.004
 chatting+0.004
Offic+0.004
 travelling+0.004
Neg logit contributions
 am-0.003
 loser-0.003
ses-0.003
kin-0.003
 hurts-0.003
Token Where
Token ID6350
Feature activation-1.41e-01

Pos logit contributions
 am+0.066
kin+0.058
ses+0.058
 loser+0.057
 hurts+0.056
Neg logit contributions
years-0.108
Offic-0.107
 airplanes-0.103
 chatting-0.102
 travelling-0.099
Token shall
Token ID2236
Feature activation-7.99e-02

Pos logit contributions
 am+0.166
 loser+0.162
ses+0.158
ger+0.146
kin+0.145
Neg logit contributions
 airplanes-0.309
years-0.303
 chatting-0.290
Offic-0.290
 travelling-0.286
Token we
Token ID356
Feature activation-7.18e-02

Pos logit contributions
 am+0.117
 loser+0.105
ses+0.100
kin+0.100
 hurts+0.095
Neg logit contributions
years-0.168
 airplanes-0.165
 chatting-0.162
Offic-0.160
 travelling-0.155
Token go
Token ID467
Feature activation-1.54e-02

Pos logit contributions
 am+0.114
 loser+0.107
ses+0.105
kin+0.103
 hurts+0.101
Neg logit contributions
years-0.135
 airplanes-0.134
 chatting-0.132
Offic-0.128
 travelling-0.128
Token today
Token ID1909
Feature activation-7.89e-02

Pos logit contributions
 am+0.022
 loser+0.018
 hurts+0.017
kin+0.016
ses+0.016
Neg logit contributions
years-0.035
Offic-0.033
 chatting-0.032
 airplanes-0.032
 terminal-0.031
Token?"
Token ID1701
Feature activation-4.17e-03

Pos logit contributions
 am+0.093
 loser+0.080
ses+0.077
kin+0.074
 hurts+0.073
Neg logit contributions
years-0.188
 airplanes-0.184
 chatting-0.181
Offic-0.179
 travelling-0.174
Token"
Token ID1
Feature activation-3.66e-04

Pos logit contributions
 am+0.005
ses+0.004
 loser+0.004
 fault+0.004
kin+0.004
Neg logit contributions
years-0.004
Offic-0.004
 airplanes-0.004
 chatting-0.004
 inviting-0.004
TokenOh
Token ID5812
Feature activation-7.50e-02

Pos logit contributions
 am+0.001
 fault+0.001
ses+0.001
 loser+0.001
 hurts+0.001
Neg logit contributions
years-0.001
Offic-0.001
 inviting-0.001
 chatting-0.001
 airplanes-0.001
Token,
Token ID11
Feature activation-1.58e-02

Pos logit contributions
ro+0.134
 am+0.127
kin+0.124
ger+0.123
ses+0.119
Neg logit contributions
years-0.119
 airplanes-0.112
Offic-0.107
 inviting-0.107
 chatting-0.107
Token no
Token ID645
Feature activation-3.21e-02

Pos logit contributions
 am+0.028
ro+0.025
 loser+0.025
kin+0.025
ses+0.025
Neg logit contributions
years-0.026
 airplanes-0.025
 inviting-0.025
Offic-0.024
 chatting-0.023
Token!
Token ID0
Feature activation-4.21e-02

Pos logit contributions
 am+0.055
kin+0.055
ses+0.055
 loser+0.052
inter+0.050
Neg logit contributions
 airplanes-0.052
 chatting-0.050
 travelling-0.049
 inviting-0.049
 visiting-0.047
Token Where
Token ID6350
Feature activation-1.40e-01

Pos logit contributions
 am+0.069
kin+0.069
ger+0.066
chy+0.065
 lesson+0.062
Neg logit contributions
years-0.070
Offic-0.065
 visiting-0.058
 libraries-0.057
ents-0.056
Token did
Token ID750
Feature activation-7.87e-02

Pos logit contributions
 loser+0.216
ses+0.216
ro+0.214
ger+0.205
mer+0.192
Neg logit contributions
 airplanes-0.238
years-0.230
ons-0.223
 libraries-0.222
Offic-0.218
Token you
Token ID345
Feature activation-9.65e-02

Pos logit contributions
 am+0.131
 loser+0.125
ses+0.121
kin+0.117
 fault+0.114
Neg logit contributions
 airplanes-0.144
years-0.140
Offic-0.136
 chatting-0.136
 travelling-0.133
Token lose
Token ID4425
Feature activation-4.82e-02

Pos logit contributions
 am+0.150
 loser+0.139
ses+0.135
kin+0.131
 hurts+0.126
Neg logit contributions
years-0.189
 airplanes-0.184
 chatting-0.180
Offic-0.178
 travelling-0.173
Token it
Token ID340
Feature activation-1.04e-01

Pos logit contributions
 am+0.080
 loser+0.076
ses+0.075
kin+0.071
 fault+0.069
Neg logit contributions
 airplanes-0.089
years-0.084
Offic-0.082
 inviting-0.082
 chatting-0.080
Token?"
Token ID1701
Feature activation-4.84e-03

Pos logit contributions
ses+0.153
 am+0.148
 loser+0.147
kin+0.131
 fault+0.128
Neg logit contributions
 airplanes-0.205
 chatting-0.193
 travelling-0.191
 inviting-0.189
 visiting-0.188
Token sad
Token ID6507
Feature activation-6.54e-02

Pos logit contributions
 am+0.083
 loser+0.075
ses+0.072
 hurts+0.072
kin+0.072
Neg logit contributions
years-0.077
 airplanes-0.073
 chatting-0.073
Offic-0.070
 travelling-0.068
Token.
Token ID13
Feature activation-3.80e-02

Pos logit contributions
 am+0.109
 loser+0.104
kin+0.102
ses+0.098
+0.093
Neg logit contributions
 airplanes-0.120
 chatting-0.117
 inviting-0.113
 visiting-0.107
 travelling-0.105
Token
Token ID198
Feature activation-1.43e-02

Pos logit contributions
 am+0.058
 loser+0.052
kin+0.051
ses+0.050
 hurts+0.049
Neg logit contributions
years-0.077
 chatting-0.073
Offic-0.073
 airplanes-0.072
 inviting-0.070
Token
Token ID198
Feature activation-1.38e-02

Pos logit contributions
 am+0.020
 loser+0.017
 hurts+0.016
ses+0.016
kin+0.016
Neg logit contributions
years-0.032
 airplanes-0.031
 chatting-0.030
Offic-0.030
 travelling-0.029
Token"
Token ID1
Feature activation+1.95e-03

Pos logit contributions
 am+0.024
 loser+0.021
kin+0.020
 hurts+0.019
ger+0.019
Neg logit contributions
 airplanes-0.026
years-0.026
 chatting-0.025
Offic-0.024
 travelling-0.024
TokenWhere
Token ID8496
Feature activation-1.40e-01

Pos logit contributions
years+0.003
Offic+0.003
 chatting+0.003
 airplanes+0.003
 inviting+0.003
Neg logit contributions
 am-0.004
 loser-0.003
ses-0.003
 hurts-0.003
 fault-0.003
Token is
Token ID318
Feature activation-6.50e-02

Pos logit contributions
 loser+0.188
ro+0.187
ses+0.184
ger+0.181
 am+0.166
Neg logit contributions
 airplanes-0.268
years-0.257
 libraries-0.253
 chatting-0.253
 talked-0.248
Token the
Token ID262
Feature activation-4.15e-02

Pos logit contributions
 am+0.111
 loser+0.098
ses+0.094
kin+0.094
ger+0.090
Neg logit contributions
years-0.123
 airplanes-0.123
 chatting-0.119
Offic-0.115
 travelling-0.113
Token fish
Token ID5916
Feature activation-2.30e-02

Pos logit contributions
 am+0.070
kin+0.060
ses+0.059
 hurts+0.059
 loser+0.059
Neg logit contributions
 airplanes-0.076
 chatting-0.071
Offic-0.071
 travelling-0.070
years-0.070
Token?"
Token ID1701
Feature activation-1.13e-02

Pos logit contributions
 am+0.029
ses+0.029
 loser+0.028
kin+0.026
 fault+0.025
Neg logit contributions
 airplanes-0.048
 chatting-0.048
years-0.047
 travelling-0.046
 visiting-0.045
Token Tom
Token ID4186
Feature activation-1.61e-02

Pos logit contributions
 am+0.018
 loser+0.017
ses+0.016
kin+0.016
 hurts+0.015
Neg logit contributions
years-0.022
 airplanes-0.021
 chatting-0.021
Offic-0.021
 travelling-0.020
Token someone
Token ID2130
Feature activation-3.77e-02

Pos logit contributions
years+0.029
 airplanes+0.028
 chatting+0.028
Offic+0.027
 travelling+0.026
Neg logit contributions
 am-0.028
 loser-0.025
ses-0.024
 hurts-0.024
kin-0.024
Token called
Token ID1444
Feature activation+3.53e-02

Pos logit contributions
 am+0.070
 hurts+0.061
 loser+0.059
ger+0.059
kin+0.059
Neg logit contributions
years-0.063
Offic-0.061
 airplanes-0.060
 terminal-0.057
 chatting-0.057
Token out
Token ID503
Feature activation-2.43e-02

Pos logit contributions
years+0.056
 chatting+0.054
 airplanes+0.053
 travelling+0.051
Offic+0.051
Neg logit contributions
 am-0.070
 loser-0.064
 hurts-0.062
ses-0.062
kin-0.061
Token:
Token ID25
Feature activation+1.64e-02

Pos logit contributions
 am+0.036
 loser+0.032
ses+0.031
kin+0.031
 hurts+0.030
Neg logit contributions
years-0.051
 chatting-0.049
 airplanes-0.049
Offic-0.048
 travelling-0.046
Token "
Token ID366
Feature activation+5.18e-03

Pos logit contributions
years+0.026
 chatting+0.026
Offic+0.025
 travelling+0.024
 airplanes+0.024
Neg logit contributions
 am-0.031
 loser-0.029
kin-0.028
ses-0.028
 hurts-0.028
TokenWhere
Token ID8496
Feature activation-1.40e-01

Pos logit contributions
years+0.010
 airplanes+0.010
 chatting+0.009
Offic+0.009
 travelling+0.009
Neg logit contributions
 am-0.009
 loser-0.008
 hurts-0.007
kin-0.007
ses-0.007
Token do
Token ID466
Feature activation-8.68e-02

Pos logit contributions
 loser+0.168
ses+0.164
ro+0.158
 am+0.155
ger+0.153
Neg logit contributions
 airplanes-0.293
years-0.291
 chatting-0.286
 libraries-0.280
 travelling-0.279
Token you
Token ID345
Feature activation-1.02e-01

Pos logit contributions
 am+0.145
ses+0.138
 loser+0.135
kin+0.130
 fault+0.129
Neg logit contributions
 airplanes-0.156
years-0.155
Offic-0.151
 chatting-0.150
 travelling-0.148
Token think
Token ID892
Feature activation-3.86e-02

Pos logit contributions
 am+0.149
 loser+0.136
ses+0.132
kin+0.129
 hurts+0.126
Neg logit contributions
years-0.211
 chatting-0.202
 airplanes-0.202
Offic-0.196
 travelling-0.194
Token you
Token ID345
Feature activation-9.56e-02

Pos logit contributions
 am+0.054
 loser+0.047
ses+0.044
kin+0.044
 hurts+0.044
Neg logit contributions
years-0.085
 airplanes-0.081
 chatting-0.081
Offic-0.080
 travelling-0.077
Token're
Token ID821
Feature activation-2.19e-02

Pos logit contributions
 am+0.120
 loser+0.110
ses+0.106
kin+0.104
 hurts+0.099
Neg logit contributions
years-0.214
 airplanes-0.209
 chatting-0.206
Offic-0.202
 travelling-0.199
Token empty
Token ID6565
Feature activation-5.20e-02

Pos logit contributions
 am+0.047
 loser+0.041
ses+0.040
 hurts+0.039
kin+0.038
Neg logit contributions
years-0.073
 airplanes-0.070
 chatting-0.069
Offic-0.069
 travelling-0.066
Token.
Token ID13
Feature activation-3.66e-02

Pos logit contributions
 am+0.091
 loser+0.084
ses+0.081
kin+0.078
 hurts+0.075
Neg logit contributions
 airplanes-0.093
 chatting-0.090
years-0.084
 travelling-0.084
 inviting-0.083
Token
Token ID198
Feature activation-1.03e-02

Pos logit contributions
 am+0.057
 loser+0.050
ses+0.048
kin+0.048
 hurts+0.048
Neg logit contributions
years-0.075
 airplanes-0.071
 chatting-0.071
Offic-0.070
 travelling-0.067
Token
Token ID198
Feature activation-1.13e-02

Pos logit contributions
 am+0.015
 loser+0.012
kin+0.011
 hurts+0.011
ses+0.011
Neg logit contributions
years-0.023
 airplanes-0.022
 chatting-0.022
Offic-0.021
 travelling-0.020
Token"
Token ID1
Feature activation-2.03e-04

Pos logit contributions
 am+0.020
 loser+0.018
kin+0.017
 hurts+0.016
ger+0.016
Neg logit contributions
 airplanes-0.020
years-0.020
 chatting-0.020
 travelling-0.018
Offic-0.018
TokenWhere
Token ID8496
Feature activation-1.40e-01

Pos logit contributions
 am+0.000
 loser+0.000
ses+0.000
 hurts+0.000
kin+0.000
Neg logit contributions
years-0.000
 airplanes-0.000
Offic-0.000
 chatting-0.000
 travelling-0.000
Token is
Token ID318
Feature activation-6.57e-02

Pos logit contributions
 loser+0.187
ses+0.182
ro+0.180
ger+0.172
 am+0.167
Neg logit contributions
 airplanes-0.276
years-0.261
 libraries-0.259
 chatting-0.256
 talked-0.251
Token everyone
Token ID2506
Feature activation-6.58e-02

Pos logit contributions
 am+0.105
 loser+0.092
ses+0.088
kin+0.087
 hurts+0.086
Neg logit contributions
years-0.133
 airplanes-0.127
 chatting-0.126
Offic-0.124
 travelling-0.119
Token?"
Token ID1701
Feature activation-8.74e-03

Pos logit contributions
 am+0.094
 loser+0.085
ses+0.082
kin+0.080
 hurts+0.078
Neg logit contributions
years-0.139
 chatting-0.135
 airplanes-0.135
Offic-0.130
 travelling-0.129
Token James
Token ID3700
Feature activation-1.77e-02

Pos logit contributions
 am+0.015
 loser+0.013
ses+0.013
kin+0.012
 hurts+0.012
Neg logit contributions
years-0.017
 airplanes-0.016
 chatting-0.016
Offic-0.016
 travelling-0.015
Token asked
Token ID1965
Feature activation+3.52e-02

Pos logit contributions
 am+0.041
 loser+0.037
 hurts+0.037
ses+0.036
kin+0.036
Neg logit contributions
years-0.023
 airplanes-0.021
Offic-0.021
 chatting-0.021
 travelling-0.019
Token one
Token ID530
Feature activation-3.93e-02

Pos logit contributions
 am+0.024
 loser+0.022
kin+0.021
ses+0.021
 hurts+0.021
Neg logit contributions
years-0.022
 airplanes-0.021
 chatting-0.020
Offic-0.020
 travelling-0.019
Token.
Token ID13
Feature activation-3.03e-02

Pos logit contributions
 am+0.065
 loser+0.062
ses+0.059
kin+0.058
 hurts+0.057
Neg logit contributions
 airplanes-0.065
 travelling-0.065
 chatting-0.063
 inviting-0.063
 visiting-0.062
Token
Token ID198
Feature activation-1.24e-02

Pos logit contributions
 am+0.048
 loser+0.043
kin+0.042
ses+0.041
 hurts+0.041
Neg logit contributions
years-0.060
Offic-0.057
 chatting-0.057
 airplanes-0.056
 travelling-0.054
Token
Token ID198
Feature activation-1.33e-02

Pos logit contributions
 am+0.019
 loser+0.016
ses+0.015
 hurts+0.015
kin+0.015
Neg logit contributions
years-0.026
 airplanes-0.025
 chatting-0.025
Offic-0.024
 travelling-0.023
Token"
Token ID1
Feature activation+5.60e-04

Pos logit contributions
 am+0.026
 loser+0.023
kin+0.022
 hurts+0.021
ger+0.021
Neg logit contributions
 airplanes-0.022
years-0.022
 chatting-0.021
Offic-0.020
 travelling-0.020
TokenWhere
Token ID8496
Feature activation-1.40e-01

Pos logit contributions
years+0.001
Offic+0.001
 airplanes+0.001
 chatting+0.001
 travelling+0.001
Neg logit contributions
 am-0.001
 loser-0.001
ses-0.001
 hurts-0.001
kin-0.001
Token is
Token ID318
Feature activation-6.48e-02

Pos logit contributions
 loser+0.183
ro+0.180
ses+0.180
ger+0.171
 am+0.169
Neg logit contributions
 airplanes-0.274
years-0.266
 libraries-0.259
 chatting-0.257
Offic-0.252
Token the
Token ID262
Feature activation-4.07e-02

Pos logit contributions
 am+0.104
 loser+0.092
ses+0.088
kin+0.087
 hurts+0.086
Neg logit contributions
years-0.130
 airplanes-0.125
 chatting-0.123
Offic-0.122
 travelling-0.117
Token key
Token ID1994
Feature activation-5.95e-02

Pos logit contributions
 am+0.070
kin+0.060
ses+0.060
 hurts+0.059
 loser+0.059
Neg logit contributions
 airplanes-0.074
years-0.070
Offic-0.070
 chatting-0.069
 travelling-0.069
Token?"
Token ID1701
Feature activation-1.15e-02

Pos logit contributions
 am+0.080
ses+0.080
 loser+0.078
kin+0.072
ger+0.070
Neg logit contributions
 airplanes-0.123
 chatting-0.115
years-0.115
 travelling-0.115
 visiting-0.112
Token Ben
Token ID3932
Feature activation-2.05e-02

Pos logit contributions
 am+0.019
 loser+0.017
ses+0.016
kin+0.016
 hurts+0.016
Neg logit contributions
years-0.022
 airplanes-0.021
 chatting-0.021
Offic-0.021
 travelling-0.020
Token goes
Token ID2925
Feature activation-1.43e-02

Pos logit contributions
 am+0.011
 loser+0.010
ses+0.009
kin+0.009
 fault+0.008
Neg logit contributions
 airplanes-0.031
years-0.031
 chatting-0.030
Offic-0.030
 inviting-0.029
Token.
Token ID13
Feature activation-2.33e-02

Pos logit contributions
 am+0.020
ses+0.019
 loser+0.019
kin+0.018
 fault+0.017
Neg logit contributions
 airplanes-0.030
years-0.029
 chatting-0.029
 travelling-0.029
 inviting-0.028
Token
Token ID198
Feature activation-8.10e-03

Pos logit contributions
 am+0.036
 loser+0.032
kin+0.031
 hurts+0.031
ses+0.031
Neg logit contributions
years-0.047
Offic-0.045
 chatting-0.045
 airplanes-0.044
 cities-0.042
Token
Token ID198
Feature activation-1.07e-02

Pos logit contributions
 am+0.011
 loser+0.008
 hurts+0.008
ses+0.008
kin+0.008
Neg logit contributions
years-0.019
 airplanes-0.018
 chatting-0.018
Offic-0.018
 travelling-0.017
Token"
Token ID1
Feature activation-2.64e-03

Pos logit contributions
 am+0.018
 loser+0.016
kin+0.014
ger+0.013
 hurts+0.013
Neg logit contributions
years-0.022
 airplanes-0.021
 chatting-0.021
 travelling-0.019
 cities-0.019
TokenWhere
Token ID8496
Feature activation-1.40e-01

Pos logit contributions
 am+0.005
 loser+0.004
ses+0.004
 hurts+0.004
kin+0.004
Neg logit contributions
years-0.005
 airplanes-0.005
Offic-0.005
 chatting-0.004
 travelling-0.004
Token does
Token ID857
Feature activation-9.78e-02

Pos logit contributions
 loser+0.164
ro+0.160
ses+0.159
ger+0.155
 am+0.153
Neg logit contributions
 airplanes-0.289
years-0.288
Offic-0.277
 chatting-0.276
 libraries-0.274
Token the
Token ID262
Feature activation-3.32e-02

Pos logit contributions
 am+0.166
 loser+0.149
ses+0.145
kin+0.142
 hurts+0.140
Neg logit contributions
years-0.183
 airplanes-0.179
 chatting-0.174
Offic-0.174
 travelling-0.167
Token slide
Token ID10649
Feature activation-1.97e-02

Pos logit contributions
 am+0.053
ses+0.048
kin+0.048
 loser+0.046
 hurts+0.045
Neg logit contributions
 airplanes-0.060
 chatting-0.058
Offic-0.057
years-0.057
 inviting-0.057
Token go
Token ID467
Feature activation-1.71e-02

Pos logit contributions
 am+0.025
ses+0.023
 loser+0.023
kin+0.022
 fault+0.021
Neg logit contributions
 airplanes-0.043
years-0.042
 chatting-0.041
Offic-0.041
 travelling-0.040
Token?"
Token ID1701
Feature activation-1.12e-02

Pos logit contributions
 loser+0.022
ses+0.022
 am+0.022
kin+0.021
 fault+0.020
Neg logit contributions
 airplanes-0.037
years-0.035
 chatting-0.035
Offic-0.035
 inviting-0.035
Token"
Token ID1
Feature activation-5.40e-03

Pos logit contributions
 am+0.017
 loser+0.015
kin+0.014
 hurts+0.014
ger+0.014
Neg logit contributions
years-0.020
 airplanes-0.020
 chatting-0.020
Offic-0.019
 travelling-0.019
TokenHello
Token ID15496
Feature activation-5.02e-02

Pos logit contributions
 am+0.010
 fault+0.009
ses+0.009
 loser+0.009
 hurts+0.009
Neg logit contributions
Offic-0.009
years-0.009
 airplanes-0.009
 chatting-0.008
 inviting-0.008
Token,
Token ID11
Feature activation-7.15e-03

Pos logit contributions
 am+0.084
ses+0.083
 loser+0.081
 fault+0.079
kin+0.078
Neg logit contributions
 airplanes-0.086
 travelling-0.083
 inviting-0.082
 chatting-0.081
Offic-0.081
Token puppy
Token ID26188
Feature activation-5.87e-03

Pos logit contributions
 am+0.012
 loser+0.011
kin+0.011
ses+0.011
 fault+0.011
Neg logit contributions
Offic-0.012
 airplanes-0.012
years-0.012
 inviting-0.012
 chatting-0.012
Token!
Token ID0
Feature activation-4.02e-02

Pos logit contributions
 fault+0.010
 loser+0.010
ses+0.010
 am+0.009
kin+0.009
Neg logit contributions
 airplanes-0.010
Offic-0.010
 inviting-0.009
 chatting-0.009
years-0.009
Token Where
Token ID6350
Feature activation-1.40e-01

Pos logit contributions
kin+0.069
 fault+0.066
 am+0.064
 loser+0.062
chy+0.061
Neg logit contributions
years-0.070
Offic-0.067
 visiting-0.064
 chatting-0.063
 travelling-0.062
Token did
Token ID750
Feature activation-7.50e-02

Pos logit contributions
 loser+0.175
ses+0.172
ro+0.165
 fault+0.154
ger+0.154
Neg logit contributions
 airplanes-0.280
years-0.271
Offic-0.267
 libraries-0.266
 chatting-0.266
Token you
Token ID345
Feature activation-9.45e-02

Pos logit contributions
 am+0.123
 loser+0.117
ses+0.114
kin+0.111
 fault+0.109
Neg logit contributions
 airplanes-0.137
years-0.137
 chatting-0.132
Offic-0.132
 travelling-0.130
Token come
Token ID1282
Feature activation-4.70e-02

Pos logit contributions
 am+0.150
 loser+0.140
ses+0.136
kin+0.132
 hurts+0.129
Neg logit contributions
years-0.182
 airplanes-0.177
 chatting-0.175
Offic-0.171
 travelling-0.168
Token from
Token ID422
Feature activation-2.05e-02

Pos logit contributions
 am+0.069
 loser+0.064
ses+0.063
kin+0.062
 hurts+0.058
Neg logit contributions
years-0.095
 airplanes-0.095
 chatting-0.093
Offic-0.091
 travelling-0.090
Token?"
Token ID1701
Feature activation-8.08e-03

Pos logit contributions
 am+0.028
 loser+0.027
ses+0.025
kin+0.024
 fault+0.023
Neg logit contributions
 airplanes-0.044
years-0.043
 chatting-0.043
Offic-0.042
 travelling-0.042
Token,
Token ID11
Feature activation+3.76e-02

Pos logit contributions
years+0.061
 chatting+0.058
 inviting+0.056
 airplanes+0.055
 travelling+0.055
Neg logit contributions
 am-0.063
 loser-0.062
ses-0.062
 fault-0.060
kin-0.059
Token "
Token ID366
Feature activation+1.43e-02

Pos logit contributions
years+0.062
 chatting+0.059
 airplanes+0.058
Offic+0.058
 travelling+0.055
Neg logit contributions
 am-0.073
 loser-0.066
ses-0.064
kin-0.064
 hurts-0.063
TokenOh
Token ID5812
Feature activation-7.83e-02

Pos logit contributions
years+0.025
Offic+0.024
 airplanes+0.023
 chatting+0.023
 inviting+0.022
Neg logit contributions
 am-0.027
 loser-0.024
ses-0.024
kin-0.024
 hurts-0.023
Token my
Token ID616
Feature activation-3.81e-02

Pos logit contributions
 am+0.135
ro+0.134
kin+0.127
 loser+0.125
ger+0.123
Neg logit contributions
years-0.127
 chatting-0.118
Offic-0.117
 travelling-0.117
 inviting-0.117
Token!
Token ID0
Feature activation-4.58e-02

Pos logit contributions
 am+0.065
 loser+0.060
kin+0.059
ro+0.057
onna+0.055
Neg logit contributions
 inviting-0.062
years-0.060
 chatting-0.060
 airplanes-0.060
 travelling-0.059
Token Where
Token ID6350
Feature activation-1.39e-01

Pos logit contributions
 am+0.073
kin+0.072
 loser+0.067
 fault+0.066
ses+0.063
Neg logit contributions
years-0.085
Offic-0.084
 chatting-0.076
 visiting-0.074
 inviting-0.073
Token did
Token ID750
Feature activation-7.80e-02

Pos logit contributions
 loser+0.163
ses+0.156
 am+0.153
ro+0.148
ger+0.144
Neg logit contributions
 airplanes-0.297
years-0.294
Offic-0.288
 chatting-0.286
 travelling-0.279
Token you
Token ID345
Feature activation-9.38e-02

Pos logit contributions
 am+0.124
 loser+0.115
ses+0.112
kin+0.109
 fault+0.105
Neg logit contributions
years-0.149
 airplanes-0.149
Offic-0.144
 chatting-0.143
 travelling-0.140
Token come
Token ID1282
Feature activation-4.57e-02

Pos logit contributions
 am+0.148
 loser+0.137
ses+0.133
kin+0.130
 hurts+0.125
Neg logit contributions
years-0.183
 airplanes-0.177
 chatting-0.175
Offic-0.171
 travelling-0.168
Token from
Token ID422
Feature activation-1.83e-02

Pos logit contributions
 am+0.068
 loser+0.061
ses+0.060
kin+0.060
 hurts+0.057
Neg logit contributions
years-0.094
 airplanes-0.092
 chatting-0.091
Offic-0.089
 travelling-0.087
Token?"
Token ID1701
Feature activation+3.42e-03

Pos logit contributions
 am+0.024
 loser+0.022
ses+0.020
kin+0.020
 hurts+0.019
Neg logit contributions
 airplanes-0.041
years-0.041
 chatting-0.040
Offic-0.040
 travelling-0.039
Token woods
Token ID16479
Feature activation-2.14e-02

Pos logit contributions
 am+0.030
 loser+0.027
ses+0.026
kin+0.026
 hurts+0.025
Neg logit contributions
years-0.028
 airplanes-0.028
 chatting-0.027
Offic-0.027
 cities-0.026
Token.
Token ID13
Feature activation-3.61e-02

Pos logit contributions
 am+0.027
 loser+0.023
ses+0.023
kin+0.022
 hurts+0.022
Neg logit contributions
years-0.049
 airplanes-0.048
 chatting-0.047
Offic-0.047
 travelling-0.045
Token
Token ID198
Feature activation-1.29e-02

Pos logit contributions
 am+0.055
 loser+0.048
kin+0.046
 hurts+0.046
ses+0.046
Neg logit contributions
years-0.074
Offic-0.071
 chatting-0.069
 airplanes-0.069
 travelling-0.066
Token
Token ID198
Feature activation-1.61e-02

Pos logit contributions
 am+0.020
 loser+0.017
ses+0.016
 hurts+0.016
kin+0.016
Neg logit contributions
years-0.027
 airplanes-0.026
 chatting-0.026
Offic-0.025
 travelling-0.024
Token"
Token ID1
Feature activation-8.78e-03

Pos logit contributions
 am+0.026
 loser+0.023
kin+0.021
 hurts+0.020
ger+0.020
Neg logit contributions
years-0.033
 airplanes-0.032
 chatting-0.032
 travelling-0.030
Offic-0.030
TokenWhere
Token ID8496
Feature activation-1.39e-01

Pos logit contributions
 am+0.016
 loser+0.014
ses+0.014
 hurts+0.014
kin+0.013
Neg logit contributions
years-0.016
Offic-0.015
 airplanes-0.015
 chatting-0.015
 terminal-0.014
Token did
Token ID750
Feature activation-8.17e-02

Pos logit contributions
 loser+0.171
ro+0.169
ses+0.168
ger+0.164
 am+0.159
Neg logit contributions
 airplanes-0.278
years-0.274
Offic-0.264
 chatting-0.263
 libraries-0.263
Token it
Token ID340
Feature activation-1.07e-01

Pos logit contributions
 am+0.148
 loser+0.145
ses+0.143
kin+0.138
 fault+0.135
Neg logit contributions
 airplanes-0.137
 travelling-0.127
Offic-0.126
 chatting-0.124
 inviting-0.124
Token go
Token ID467
Feature activation-2.24e-02

Pos logit contributions
 loser+0.160
 am+0.156
ses+0.156
kin+0.148
 fault+0.146
Neg logit contributions
 airplanes-0.205
 chatting-0.196
Offic-0.192
 travelling-0.191
 inviting-0.189
Token?"
Token ID1701
Feature activation-1.21e-02

Pos logit contributions
 am+0.030
ses+0.029
 loser+0.029
kin+0.028
 fault+0.027
Neg logit contributions
 airplanes-0.048
years-0.046
 chatting-0.045
 travelling-0.045
 inviting-0.045
Token Tom
Token ID4186
Feature activation-1.64e-02

Pos logit contributions
 am+0.020
 loser+0.019
kin+0.018
 fault+0.018
ses+0.017
Neg logit contributions
years-0.022
 chatting-0.021
 airplanes-0.021
Offic-0.021
 inviting-0.020
Token tree
Token ID5509
Feature activation-1.22e-02

Pos logit contributions
 am+0.021
ses+0.019
 loser+0.019
kin+0.018
 hurts+0.018
Neg logit contributions
 airplanes-0.021
Offic-0.020
years-0.020
 chatting-0.020
 inviting-0.019
Token.
Token ID13
Feature activation-2.47e-02

Pos logit contributions
 am+0.015
 loser+0.013
ses+0.013
kin+0.013
 hurts+0.012
Neg logit contributions
years-0.028
 chatting-0.027
 airplanes-0.027
Offic-0.027
 travelling-0.026
Token
Token ID198
Feature activation-9.79e-03

Pos logit contributions
 am+0.036
 loser+0.032
ses+0.030
 hurts+0.030
kin+0.030
Neg logit contributions
years-0.053
 airplanes-0.050
 chatting-0.050
Offic-0.050
 travelling-0.047
Token
Token ID198
Feature activation-1.25e-02

Pos logit contributions
 am+0.013
 loser+0.010
 hurts+0.009
kin+0.009
ses+0.009
Neg logit contributions
years-0.023
 airplanes-0.022
 chatting-0.022
Offic-0.021
 travelling-0.021
Token"
Token ID1
Feature activation-2.14e-04

Pos logit contributions
 am+0.021
 loser+0.018
kin+0.017
ger+0.016
 hurts+0.016
Neg logit contributions
 airplanes-0.025
years-0.024
 chatting-0.024
 travelling-0.022
 cities-0.022
TokenWhere
Token ID8496
Feature activation-1.39e-01

Pos logit contributions
 am+0.000
 loser+0.000
ses+0.000
 hurts+0.000
kin+0.000
Neg logit contributions
years-0.000
 airplanes-0.000
 chatting-0.000
Offic-0.000
 travelling-0.000
Token is
Token ID318
Feature activation-6.62e-02

Pos logit contributions
 loser+0.174
ro+0.169
ses+0.169
ger+0.160
 am+0.159
Neg logit contributions
 airplanes-0.278
years-0.275
 chatting-0.265
 libraries-0.263
Offic-0.263
Token the
Token ID262
Feature activation-4.29e-02

Pos logit contributions
 am+0.106
 loser+0.094
ses+0.089
kin+0.089
 hurts+0.088
Neg logit contributions
years-0.133
 airplanes-0.128
 chatting-0.126
Offic-0.125
 travelling-0.120
Token ball
Token ID2613
Feature activation-3.14e-02

Pos logit contributions
 am+0.071
kin+0.061
 hurts+0.059
ses+0.059
 loser+0.059
Neg logit contributions
 airplanes-0.080
Offic-0.075
 chatting-0.075
 travelling-0.074
 inviting-0.073
Token,
Token ID11
Feature activation-7.32e-03

Pos logit contributions
ses+0.043
 am+0.043
 loser+0.041
 fault+0.040
kin+0.040
Neg logit contributions
 airplanes-0.064
 chatting-0.060
 travelling-0.058
years-0.058
Offic-0.058
Token Spot
Token ID15899
Feature activation-3.34e-02

Pos logit contributions
 am+0.012
 loser+0.010
ses+0.010
 hurts+0.010
kin+0.010
Neg logit contributions
years-0.014
 airplanes-0.014
Offic-0.014
 chatting-0.014
 inviting-0.013
Token part
Token ID636
Feature activation-4.56e-02

Pos logit contributions
 am+0.040
kin+0.034
 loser+0.034
ses+0.032
ger+0.032
Neg logit contributions
 airplanes-0.042
 chatting-0.039
years-0.038
 cities-0.037
 travelling-0.037
Token.
Token ID13
Feature activation-2.66e-02

Pos logit contributions
 am+0.085
ses+0.083
 loser+0.080
 fault+0.075
 hurts+0.074
Neg logit contributions
 airplanes-0.069
 chatting-0.068
years-0.064
 cities-0.062
 inviting-0.061
Token
Token ID198
Feature activation-1.04e-02

Pos logit contributions
 am+0.039
 loser+0.034
kin+0.033
 hurts+0.033
ses+0.033
Neg logit contributions
years-0.056
Offic-0.053
 chatting-0.053
 airplanes-0.053
 inviting-0.050
Token
Token ID198
Feature activation-1.19e-02

Pos logit contributions
 am+0.015
 loser+0.013
ses+0.012
 hurts+0.012
kin+0.012
Neg logit contributions
years-0.022
 airplanes-0.021
 chatting-0.021
Offic-0.021
 travelling-0.020
Token"
Token ID1
Feature activation-1.70e-03

Pos logit contributions
 am+0.022
 loser+0.019
kin+0.018
 hurts+0.018
ses+0.017
Neg logit contributions
years-0.022
 airplanes-0.021
 chatting-0.021
Offic-0.019
 travelling-0.019
TokenWhere
Token ID8496
Feature activation-1.39e-01

Pos logit contributions
 am+0.003
 loser+0.003
ses+0.003
kin+0.003
 hurts+0.003
Neg logit contributions
years-0.003
Offic-0.003
 airplanes-0.003
 chatting-0.003
 cities-0.003
Token is
Token ID318
Feature activation-6.11e-02

Pos logit contributions
 loser+0.161
ro+0.158
ses+0.158
ger+0.153
 am+0.147
Neg logit contributions
 airplanes-0.288
years-0.284
Offic-0.275
 libraries-0.274
 chatting-0.273
Token it
Token ID340
Feature activation-1.13e-01

Pos logit contributions
 am+0.110
 loser+0.099
kin+0.096
ses+0.095
ro+0.093
Neg logit contributions
 airplanes-0.109
years-0.108
 chatting-0.104
Offic-0.101
 travelling-0.101
Token?"
Token ID1701
Feature activation-1.18e-02

Pos logit contributions
 loser+0.161
 am+0.156
ses+0.153
ro+0.140
 fault+0.138
Neg logit contributions
 airplanes-0.225
 chatting-0.220
 travelling-0.215
 visiting-0.214
 inviting-0.213
Token Anna
Token ID11735
Feature activation-2.29e-02

Pos logit contributions
 am+0.018
 loser+0.016
kin+0.016
ses+0.016
 fault+0.015
Neg logit contributions
years-0.024
 chatting-0.023
 airplanes-0.022
Offic-0.022
 inviting-0.021
Token wonders
Token ID19294
Feature activation-1.66e-04

Pos logit contributions
 am+0.045
 loser+0.041
kin+0.039
ses+0.039
 hurts+0.039
Neg logit contributions
years-0.037
 airplanes-0.035
 chatting-0.035
Offic-0.034
 travelling-0.032
Token,
Token ID11
Feature activation+3.20e-02

Pos logit contributions
 am+0.043
 loser+0.039
kin+0.038
ses+0.036
 hurts+0.036
Neg logit contributions
years-0.061
 chatting-0.060
 airplanes-0.060
 travelling-0.058
Offic-0.057
Token "
Token ID366
Feature activation+4.10e-03

Pos logit contributions
years+0.068
 airplanes+0.063
Offic+0.062
 terminal+0.059
 chatting+0.059
Neg logit contributions
 am-0.049
ses-0.040
 loser-0.040
 hurts-0.039
ger-0.037
TokenHi
Token ID17250
Feature activation-4.86e-02

Pos logit contributions
years+0.008
 airplanes+0.008
 chatting+0.008
Offic+0.007
 travelling+0.007
Neg logit contributions
 am-0.007
 loser-0.006
 hurts-0.005
kin-0.005
ger-0.005
Token Tom
Token ID4186
Feature activation-5.25e-02

Pos logit contributions
 am+0.081
ses+0.075
 loser+0.072
 hurts+0.072
kin+0.069
Neg logit contributions
 airplanes-0.089
years-0.086
Offic-0.085
 chatting-0.085
 travelling-0.084
Token!
Token ID0
Feature activation-4.71e-02

Pos logit contributions
 am+0.089
ses+0.082
 loser+0.078
kin+0.075
 hurts+0.075
Neg logit contributions
 airplanes-0.094
 chatting-0.093
Offic-0.092
years-0.091
 visiting-0.089
Token Where
Token ID6350
Feature activation-1.38e-01

Pos logit contributions
 am+0.072
kin+0.069
ses+0.066
 hurts+0.065
 fault+0.065
Neg logit contributions
years-0.093
Offic-0.091
 chatting-0.086
 visiting-0.083
 travelling-0.083
Token did
Token ID750
Feature activation-7.53e-02

Pos logit contributions
ses+0.170
 loser+0.169
ro+0.158
 am+0.156
ger+0.154
Neg logit contributions
 airplanes-0.284
years-0.281
Offic-0.277
 chatting-0.272
 travelling-0.268
Token you
Token ID345
Feature activation-8.93e-02

Pos logit contributions
 am+0.118
 loser+0.108
ses+0.106
kin+0.103
 hurts+0.098
Neg logit contributions
years-0.147
 airplanes-0.146
Offic-0.141
 chatting-0.141
 travelling-0.137
Token find
Token ID1064
Feature activation-1.64e-02

Pos logit contributions
 am+0.145
 loser+0.130
ses+0.126
kin+0.124
 hurts+0.123
Neg logit contributions
years-0.175
 airplanes-0.168
 chatting-0.166
Offic-0.164
 travelling-0.158
Token that
Token ID326
Feature activation-3.78e-02

Pos logit contributions
 am+0.029
 loser+0.026
ses+0.025
kin+0.025
 hurts+0.024
Neg logit contributions
years-0.030
 airplanes-0.029
 chatting-0.028
Offic-0.028
 travelling-0.027
Token onion
Token ID21670
Feature activation-2.70e-02

Pos logit contributions
ses+0.055
 am+0.055
 loser+0.053
ro+0.051
kin+0.048
Neg logit contributions
 airplanes-0.074
 inviting-0.071
 chatting-0.070
 travelling-0.069
Offic-0.069
TokenOh
Token ID5812
Feature activation-7.82e-02

Pos logit contributions
 am+0.005
 loser+0.004
ses+0.004
kin+0.004
 hurts+0.004
Neg logit contributions
years-0.005
Offic-0.005
 airplanes-0.005
 chatting-0.005
 cities-0.004
Token no
Token ID645
Feature activation-3.26e-02

Pos logit contributions
ro+0.130
 am+0.130
kin+0.123
ger+0.121
chy+0.119
Neg logit contributions
years-0.130
 airplanes-0.123
Offic-0.122
 chatting-0.122
 inviting-0.118
Token,
Token ID11
Feature activation-1.65e-02

Pos logit contributions
kin+0.056
 am+0.054
ses+0.051
chy+0.050
 loser+0.049
Neg logit contributions
 chatting-0.053
 airplanes-0.053
 travelling-0.052
Offic-0.051
 inviting-0.050
Token Lily
Token ID20037
Feature activation-4.75e-02

Pos logit contributions
 am+0.029
ro+0.026
ses+0.025
kin+0.025
ger+0.025
Neg logit contributions
years-0.027
Offic-0.026
 airplanes-0.026
 inviting-0.026
 chatting-0.025
Token!
Token ID0
Feature activation-4.21e-02

Pos logit contributions
 loser+0.080
 am+0.080
kin+0.077
ro+0.074
ses+0.073
Neg logit contributions
 airplanes-0.074
Offic-0.072
 chatting-0.070
 visiting-0.067
 travelling-0.067
Token Where
Token ID6350
Feature activation-1.38e-01

Pos logit contributions
kin+0.074
chy+0.069
ger+0.066
 am+0.066
 lesson+0.063
Neg logit contributions
years-0.067
Offic-0.066
 visiting-0.060
 chatting-0.059
 libraries-0.059
Token is
Token ID318
Feature activation-5.23e-02

Pos logit contributions
 loser+0.180
ro+0.180
ses+0.177
ger+0.172
 fault+0.160
Neg logit contributions
 airplanes-0.262
years-0.257
Offic-0.255
 libraries-0.246
 chatting-0.246
Token the
Token ID262
Feature activation-2.53e-02

Pos logit contributions
 am+0.092
 loser+0.084
kin+0.081
ses+0.080
ro+0.078
Neg logit contributions
years-0.093
 airplanes-0.093
 chatting-0.089
Offic-0.088
 travelling-0.086
Token ball
Token ID2613
Feature activation-2.28e-02

Pos logit contributions
 am+0.041
kin+0.036
ro+0.035
inea+0.034
+0.034
Neg logit contributions
 airplanes-0.046
Offic-0.043
 chatting-0.043
 inviting-0.043
 terminal-0.041
Token?"
Token ID1701
Feature activation-4.09e-03

Pos logit contributions
ses+0.035
 loser+0.033
 fault+0.033
kin+0.032
 am+0.031
Neg logit contributions
 airplanes-0.041
 inviting-0.039
 chatting-0.039
 travelling-0.038
Offic-0.037
Token Tom
Token ID4186
Feature activation-7.39e-03

Pos logit contributions
 am+0.007
 loser+0.006
kin+0.006
ses+0.006
 fault+0.006
Neg logit contributions
years-0.007
Offic-0.007
 chatting-0.007
 airplanes-0.007
 inviting-0.007
Token of
Token ID286
Feature activation-1.48e-02

Pos logit contributions
 am+0.014
 loser+0.012
ses+0.012
 hurts+0.012
kin+0.011
Neg logit contributions
years-0.016
 airplanes-0.016
 chatting-0.015
Offic-0.015
 travelling-0.015
Token them
Token ID606
Feature activation-2.61e-02

Pos logit contributions
 am+0.024
 loser+0.021
kin+0.021
ger+0.019
 thing+0.019
Neg logit contributions
years-0.028
Offic-0.027
 airplanes-0.027
 chatting-0.027
 terminal-0.026
Token asked
Token ID1965
Feature activation+2.84e-02

Pos logit contributions
 am+0.045
 loser+0.036
 hurts+0.036
kin+0.035
ger+0.035
Neg logit contributions
years-0.050
Offic-0.048
 airplanes-0.047
 chatting-0.045
 terminal-0.045
Token,
Token ID11
Feature activation+2.80e-02

Pos logit contributions
years+0.051
Offic+0.049
 airplanes+0.048
 chatting+0.048
 terminal+0.047
Neg logit contributions
 am-0.050
 loser-0.045
ger-0.042
kin-0.042
ses-0.041
Token "
Token ID366
Feature activation+1.10e-02

Pos logit contributions
years+0.054
 airplanes+0.052
Offic+0.051
 chatting+0.051
 terminal+0.049
Neg logit contributions
 am-0.047
 loser-0.041
ses-0.039
kin-0.039
 hurts-0.037
TokenWhere
Token ID8496
Feature activation-1.38e-01

Pos logit contributions
years+0.020
 airplanes+0.019
 chatting+0.019
 terminal+0.018
 travelling+0.018
Neg logit contributions
 am-0.019
 loser-0.018
'm-0.015
ger-0.015
ro-0.015
Token have
Token ID423
Feature activation-6.43e-02

Pos logit contributions
 loser+0.152
ses+0.149
 am+0.145
ro+0.140
kin+0.137
Neg logit contributions
 airplanes-0.307
years-0.303
 chatting-0.295
Offic-0.293
 travelling-0.288
Token you
Token ID345
Feature activation-9.76e-02

Pos logit contributions
 am+0.090
 loser+0.078
 hurts+0.076
ger+0.069
 thing+0.069
Neg logit contributions
years-0.142
Offic-0.133
 chatting-0.133
 airplanes-0.128
 terminal-0.124
Token been
Token ID587
Feature activation-4.98e-02

Pos logit contributions
 am+0.172
 loser+0.148
 hurts+0.142
kin+0.139
ses+0.138
Neg logit contributions
years-0.185
Offic-0.178
 airplanes-0.176
 chatting-0.172
 terminal-0.165
Token?"
Token ID1701
Feature activation+1.36e-03

Pos logit contributions
 am+0.092
 loser+0.082
 hurts+0.078
kin+0.078
ses+0.078
Neg logit contributions
years-0.088
Offic-0.083
 airplanes-0.083
 chatting-0.081
 terminal-0.077
Token The
Token ID383
Feature activation+1.84e-02

Pos logit contributions
years+0.003
 airplanes+0.003
 chatting+0.003
Offic+0.003
 travelling+0.003
Neg logit contributions
 am-0.002
 loser-0.002
ses-0.002
kin-0.002
 hurts-0.002
Token heart
Token ID2612
Feature activation-2.16e-02

Pos logit contributions
 airplanes+0.013
years+0.013
 chatting+0.013
Offic+0.013
 travelling+0.012
Neg logit contributions
 am-0.012
ses-0.010
 loser-0.010
kin-0.010
 hurts-0.010
Token.
Token ID13
Feature activation-1.96e-02

Pos logit contributions
 am+0.032
ses+0.028
 loser+0.028
 hurts+0.026
ger+0.025
Neg logit contributions
 airplanes-0.045
 chatting-0.043
years-0.042
 travelling-0.041
 visiting-0.040
Token
Token ID198
Feature activation-4.79e-04

Pos logit contributions
 am+0.031
 loser+0.027
 hurts+0.027
ses+0.026
kin+0.026
Neg logit contributions
years-0.039
Offic-0.037
 chatting-0.037
 airplanes-0.037
 inviting-0.035
Token
Token ID198
Feature activation-4.77e-03

Pos logit contributions
 am+0.001
 loser+0.001
kin+0.001
 hurts+0.001
ses+0.001
Neg logit contributions
years-0.001
 airplanes-0.001
 chatting-0.001
Offic-0.001
 travelling-0.001
Token"
Token ID1
Feature activation-1.96e-03

Pos logit contributions
 am+0.008
 loser+0.007
kin+0.007
 hurts+0.006
ger+0.006
Neg logit contributions
years-0.009
 airplanes-0.009
 chatting-0.009
 travelling-0.009
 cities-0.008
TokenWhere
Token ID8496
Feature activation-1.38e-01

Pos logit contributions
 am+0.003
 loser+0.003
ses+0.003
kin+0.003
 hurts+0.003
Neg logit contributions
years-0.004
 airplanes-0.003
 chatting-0.003
Offic-0.003
 travelling-0.003
Token is
Token ID318
Feature activation-5.96e-02

Pos logit contributions
 loser+0.157
 am+0.157
ses+0.155
ger+0.155
ro+0.153
Neg logit contributions
years-0.293
 airplanes-0.286
Offic-0.277
 chatting-0.276
 libraries-0.272
Token the
Token ID262
Feature activation-3.79e-02

Pos logit contributions
 am+0.105
 loser+0.093
ses+0.089
kin+0.089
 hurts+0.086
Neg logit contributions
years-0.110
 airplanes-0.108
 chatting-0.105
Offic-0.102
 travelling-0.101
Token heart
Token ID2612
Feature activation-6.04e-02

Pos logit contributions
 am+0.067
ger+0.057
ses+0.056
 hurts+0.056
kin+0.055
Neg logit contributions
 airplanes-0.068
years-0.064
 chatting-0.062
Offic-0.061
 travelling-0.061
Token piece
Token ID3704
Feature activation-3.37e-02

Pos logit contributions
 am+0.085
ses+0.083
 loser+0.080
ger+0.076
 fault+0.073
Neg logit contributions
 airplanes-0.123
 chatting-0.116
years-0.116
 travelling-0.112
 visiting-0.111
Token,
Token ID11
Feature activation+9.49e-04

Pos logit contributions
ses+0.052
 am+0.049
 loser+0.049
 fault+0.045
ro+0.044
Neg logit contributions
 airplanes-0.063
 chatting-0.056
 travelling-0.055
years-0.055
ents-0.054
Token garden
Token ID11376
Feature activation-9.98e-03

Pos logit contributions
 am+0.057
 loser+0.051
ses+0.050
kin+0.049
 hurts+0.049
Neg logit contributions
 airplanes-0.051
years-0.050
 chatting-0.049
Offic-0.047
 travelling-0.047
Token.
Token ID13
Feature activation-2.88e-02

Pos logit contributions
 am+0.014
 loser+0.013
ses+0.012
 hurts+0.012
kin+0.012
Neg logit contributions
years-0.021
 airplanes-0.021
 chatting-0.020
Offic-0.020
 travelling-0.019
Token
Token ID198
Feature activation-9.71e-03

Pos logit contributions
 am+0.044
 loser+0.038
ses+0.037
kin+0.036
 hurts+0.036
Neg logit contributions
years-0.060
 airplanes-0.057
 chatting-0.057
Offic-0.056
 travelling-0.054
Token
Token ID198
Feature activation-1.12e-02

Pos logit contributions
 am+0.014
 loser+0.011
kin+0.010
 hurts+0.010
ger+0.010
Neg logit contributions
years-0.022
 airplanes-0.021
 chatting-0.021
Offic-0.020
 travelling-0.020
Token"
Token ID1
Feature activation+9.48e-04

Pos logit contributions
 am+0.020
 loser+0.018
kin+0.017
'm+0.016
ger+0.016
Neg logit contributions
 airplanes-0.020
years-0.020
 chatting-0.020
 travelling-0.018
 cities-0.018
TokenWhere
Token ID8496
Feature activation-1.38e-01

Pos logit contributions
years+0.002
 airplanes+0.002
 chatting+0.002
Offic+0.002
 travelling+0.002
Neg logit contributions
 am-0.002
 loser-0.001
kin-0.001
ses-0.001
 hurts-0.001
Token are
Token ID389
Feature activation-5.09e-02

Pos logit contributions
 loser+0.183
ses+0.177
ro+0.175
ger+0.167
 am+0.165
Neg logit contributions
 airplanes-0.273
years-0.262
 chatting-0.256
 libraries-0.256
Offic-0.250
Token you
Token ID345
Feature activation-1.06e-01

Pos logit contributions
 am+0.075
 loser+0.065
 hurts+0.062
ses+0.061
kin+0.061
Neg logit contributions
years-0.109
 airplanes-0.104
 chatting-0.103
Offic-0.103
 travelling-0.098
Token going
Token ID1016
Feature activation-5.09e-02

Pos logit contributions
 am+0.203
 loser+0.182
kin+0.173
ses+0.173
 hurts+0.173
Neg logit contributions
years-0.184
Offic-0.174
 airplanes-0.171
 chatting-0.168
 terminal-0.160
Token?"
Token ID1701
Feature activation-1.12e-02

Pos logit contributions
 am+0.070
 loser+0.062
ses+0.060
kin+0.060
 hurts+0.058
Neg logit contributions
years-0.110
 airplanes-0.109
 chatting-0.107
Offic-0.104
 travelling-0.103
Token asked
Token ID1965
Feature activation+2.65e-02

Pos logit contributions
 am+0.019
 loser+0.017
ses+0.016
kin+0.016
 hurts+0.016
Neg logit contributions
years-0.021
 airplanes-0.020
 chatting-0.020
Offic-0.020
 travelling-0.019
Token.
Token ID13
Feature activation-1.85e-02

Pos logit contributions
ses+0.029
 am+0.026
kin+0.026
 loser+0.026
 fault+0.025
Neg logit contributions
 inviting-0.032
 travelling-0.031
 visiting-0.031
 chatting-0.031
years-0.031
Token He
Token ID679
Feature activation-4.77e-04

Pos logit contributions
 am+0.029
 loser+0.025
kin+0.024
ses+0.024
 hurts+0.024
Neg logit contributions
years-0.038
 airplanes-0.036
 chatting-0.036
Offic-0.036
 travelling-0.034
Token shook
Token ID14682
Feature activation+1.20e-02

Pos logit contributions
 am+0.001
 loser+0.001
 hurts+0.001
ses+0.001
kin+0.001
Neg logit contributions
years-0.001
 airplanes-0.001
 chatting-0.001
Offic-0.001
 cities-0.001
Token his
Token ID465
Feature activation+1.72e-02

Pos logit contributions
years+0.022
Offic+0.021
 airplanes+0.021
 chatting+0.020
 terminal+0.020
Neg logit contributions
 am-0.021
 loser-0.019
ses-0.019
 hurts-0.019
kin-0.018
Token head
Token ID1182
Feature activation+2.32e-02

Pos logit contributions
 airplanes+0.032
 chatting+0.030
 travelling+0.030
 inviting+0.029
 fields+0.029
Neg logit contributions
 am-0.030
 loser-0.024
'm-0.023
kin-0.023
ger-0.022
Token and
Token ID290
Feature activation+9.06e-02

Pos logit contributions
years+0.045
Offic+0.043
 airplanes+0.043
 chatting+0.042
 terminal+0.040
Neg logit contributions
 am-0.038
 loser-0.034
ses-0.032
 hurts-0.032
kin-0.032
Token said
Token ID531
Feature activation+3.94e-02

Pos logit contributions
years+0.170
 airplanes+0.157
Offic+0.154
 chatting+0.151
 cities+0.148
Neg logit contributions
 am-0.162
 hurts-0.143
ses-0.142
 loser-0.135
kin-0.134
Token,
Token ID11
Feature activation+3.55e-02

Pos logit contributions
years+0.078
 airplanes+0.074
 chatting+0.073
 travelling+0.070
Offic+0.070
Neg logit contributions
 am-0.063
ses-0.056
 loser-0.056
kin-0.053
 hurts-0.052
Token "
Token ID366
Feature activation+1.46e-02

Pos logit contributions
years+0.074
 airplanes+0.071
Offic+0.069
 chatting+0.069
 terminal+0.068
Neg logit contributions
 am-0.053
 loser-0.046
ses-0.045
 hurts-0.043
kin-0.042
TokenWe
Token ID1135
Feature activation-5.17e-02

Pos logit contributions
years+0.028
Offic+0.026
 airplanes+0.026
 chatting+0.026
 travelling+0.024
Neg logit contributions
 am-0.025
 loser-0.022
ses-0.022
kin-0.022
 hurts-0.021
Token all
Token ID477
Feature activation-1.02e-03

Pos logit contributions
 am+0.066
ses+0.065
kin+0.062
 loser+0.062
ger+0.060
Neg logit contributions
years-0.107
 airplanes-0.105
 chatting-0.103
 travelling-0.099
 talked-0.097
Token paint
Token ID7521
Feature activation+4.17e-02

Pos logit contributions
years+0.048
 airplanes+0.045
 chatting+0.045
Offic+0.044
 travelling+0.042
Neg logit contributions
 am-0.048
 loser-0.043
ses-0.042
kin-0.041
 hurts-0.041
Token and
Token ID290
Feature activation+1.06e-01

Pos logit contributions
 airplanes+0.089
years+0.086
 chatting+0.082
Offic+0.080
 cities+0.080
Neg logit contributions
 am-0.059
 loser-0.055
ses-0.054
 hurts-0.050
kin-0.048
Token paper
Token ID3348
Feature activation+3.88e-02

Pos logit contributions
years+0.212
Offic+0.198
 chatting+0.194
 airplanes+0.186
 terminal+0.185
Neg logit contributions
 am-0.181
 hurts-0.149
 loser-0.145
kin-0.144
ses-0.144
Token.
Token ID13
Feature activation+7.87e-03

Pos logit contributions
 airplanes+0.085
years+0.080
 chatting+0.079
 travelling+0.075
 cities+0.075
Neg logit contributions
 am-0.053
 loser-0.051
ses-0.046
 hurts-0.045
kin-0.043
Token Tim
Token ID5045
Feature activation+1.03e-02

Pos logit contributions
years+0.015
 airplanes+0.014
 chatting+0.014
Offic+0.014
 travelling+0.013
Neg logit contributions
 am-0.014
 loser-0.012
ses-0.012
 hurts-0.012
kin-0.011
Token painted
Token ID13055
Feature activation+5.58e-02

Pos logit contributions
years+0.023
Offic+0.022
 chatting+0.022
 airplanes+0.022
 travelling+0.022
Neg logit contributions
 am-0.014
 loser-0.011
 hurts-0.011
kin-0.011
ses-0.011
Token a
Token ID257
Feature activation+3.89e-02

Pos logit contributions
years+0.127
Offic+0.122
 chatting+0.119
 terminal+0.114
 travelling+0.111
Neg logit contributions
 am-0.078
kin-0.063
 loser-0.061
 thing-0.061
ger-0.058
Token picture
Token ID4286
Feature activation+1.58e-02

Pos logit contributions
years+0.088
 chatting+0.079
Offic+0.077
 travelling+0.075
 airplanes+0.073
Neg logit contributions
 am-0.058
 loser-0.051
kin-0.050
 thing-0.048
 snake-0.046
Token of
Token ID286
Feature activation+3.36e-02

Pos logit contributions
years+0.030
Offic+0.029
 chatting+0.028
 airplanes+0.028
 terminal+0.027
Neg logit contributions
 am-0.027
 loser-0.024
kin-0.023
 hurts-0.023
ses-0.023
Token the
Token ID262
Feature activation+2.14e-02

Pos logit contributions
years+0.082
Offic+0.077
 chatting+0.074
 terminal+0.073
 airplanes+0.071
Neg logit contributions
 am-0.041
 loser-0.031
ses-0.030
 thing-0.030
kin-0.030
Token city
Token ID1748
Feature activation+3.73e-02

Pos logit contributions
years+0.038
 chatting+0.036
Offic+0.035
 airplanes+0.034
 terminal+0.032
Neg logit contributions
 am-0.041
 loser-0.036
 thing-0.034
 hurts-0.033
ger-0.033
Token happy
Token ID3772
Feature activation-1.46e-02

Pos logit contributions
years+0.011
Offic+0.010
 airplanes+0.010
 chatting+0.009
 terminal+0.009
Neg logit contributions
 am-0.010
 loser-0.010
kin-0.009
ses-0.009
 fault-0.009
Token brothers
Token ID9397
Feature activation+5.92e-03

Pos logit contributions
 am+0.021
 loser+0.018
ses+0.017
 hurts+0.017
kin+0.017
Neg logit contributions
years-0.032
 airplanes-0.030
Offic-0.030
 chatting-0.030
 travelling-0.029
Token.
Token ID13
Feature activation-1.49e-02

Pos logit contributions
years+0.014
 airplanes+0.013
 chatting+0.013
Offic+0.013
 travelling+0.013
Neg logit contributions
 am-0.008
 loser-0.006
ses-0.006
kin-0.006
 hurts-0.006
Token<|endoftext|>
Token ID50256
Feature activation-2.85e-02

Pos logit contributions
 am+0.026
kin+0.024
 loser+0.024
 hurts+0.024
ger+0.023
Neg logit contributions
years-0.025
Offic-0.025
 chatting-0.025
 inviting-0.024
 airplanes-0.024
TokenJen
Token ID44875
Feature activation-5.70e-03

Pos logit contributions
ses+0.055
 fault+0.052
 loser+0.051
kin+0.049
chy+0.048
Neg logit contributions
 chatting-0.041
 airplanes-0.041
 forests-0.041
 visiting-0.041
 travelling-0.040
Token and
Token ID290
Feature activation+5.99e-02

Pos logit contributions
 loser+0.011
 am+0.011
kin+0.010
ses+0.010
 hurts+0.009
Neg logit contributions
 airplanes-0.009
years-0.009
 chatting-0.008
Offic-0.008
 visiting-0.008
Token Ben
Token ID3932
Feature activation-2.54e-02

Pos logit contributions
 airplanes+0.093
years+0.090
 chatting+0.090
Offic+0.086
 inviting+0.086
Neg logit contributions
kin-0.111
 loser-0.111
 am-0.109
ro-0.106
 fault-0.106
Token were
Token ID547
Feature activation+1.52e-03

Pos logit contributions
 loser+0.035
 am+0.034
kin+0.033
ses+0.033
 hurts+0.032
Neg logit contributions
years-0.050
 airplanes-0.050
 chatting-0.049
Offic-0.048
 travelling-0.047
Token twins
Token ID20345
Feature activation-5.27e-03

Pos logit contributions
Offic+0.004
years+0.003
 airplanes+0.003
 cities+0.003
 terminal+0.003
Neg logit contributions
 am-0.002
 loser-0.002
ses-0.002
 fault-0.002
kin-0.001
Token who
Token ID508
Feature activation-5.79e-03

Pos logit contributions
 am+0.007
 loser+0.006
ses+0.005
kin+0.005
 hurts+0.005
Neg logit contributions
years-0.012
 airplanes-0.012
 chatting-0.012
Offic-0.012
 travelling-0.011
Token liked
Token ID8288
Feature activation+2.77e-02

Pos logit contributions
 loser+0.012
ses+0.011
 hurts+0.011
 fault+0.011
kin+0.011
Neg logit contributions
years-0.008
 chatting-0.008
 talked-0.007
 airplanes-0.007
 inviting-0.007
Token and
Token ID290
Feature activation+9.39e-02

Pos logit contributions
years+0.071
Offic+0.067
 airplanes+0.067
 chatting+0.066
 terminal+0.064
Neg logit contributions
 am-0.050
 loser-0.043
kin-0.042
 hurts-0.042
ses-0.041
Token they
Token ID484
Feature activation+2.46e-02

Pos logit contributions
years+0.181
 airplanes+0.169
Offic+0.165
 chatting+0.165
 libraries+0.164
Neg logit contributions
 am-0.154
ger-0.137
kin-0.131
 loser-0.131
ses-0.129
Token soon
Token ID2582
Feature activation+2.47e-03

Pos logit contributions
 libraries+0.047
Offic+0.047
 cities+0.046
irens+0.045
years+0.045
Neg logit contributions
 am-0.036
kin-0.034
ger-0.032
ny-0.031
chy-0.028
Token found
Token ID1043
Feature activation+3.85e-02

Pos logit contributions
years+0.005
Offic+0.004
 airplanes+0.004
 chatting+0.004
 libraries+0.004
Neg logit contributions
 am-0.004
ger-0.004
kin-0.004
 loser-0.004
 hurts-0.004
Token many
Token ID867
Feature activation+4.36e-02

Pos logit contributions
years+0.071
 airplanes+0.068
 chatting+0.066
Offic+0.065
 travelling+0.064
Neg logit contributions
 am-0.067
 loser-0.062
kin-0.060
 hurts-0.059
ses-0.058
Token ac
Token ID936
Feature activation+6.43e-02

Pos logit contributions
years+0.097
Offic+0.096
 chatting+0.092
 terminal+0.091
Kelly+0.087
Neg logit contributions
 am-0.062
ger-0.047
 hurts-0.047
 thing-0.047
 is-0.044
Tokenorns
Token ID19942
Feature activation+1.96e-02

Pos logit contributions
years+0.099
 chatting+0.095
 airplanes+0.092
Offic+0.092
 terminal+0.087
Neg logit contributions
 am-0.135
 loser-0.118
 hurts-0.116
ger-0.115
kin-0.114
Token together
Token ID1978
Feature activation-1.88e-02

Pos logit contributions
years+0.044
Offic+0.042
 airplanes+0.042
 chatting+0.041
 terminal+0.040
Neg logit contributions
 am-0.027
 loser-0.022
kin-0.022
ses-0.021
 hurts-0.021
Token.
Token ID13
Feature activation+1.34e-03

Pos logit contributions
 am+0.016
kin+0.011
 loser+0.010
ger+0.010
ses+0.009
Neg logit contributions
years-0.053
Offic-0.052
 airplanes-0.050
 terminal-0.049
 chatting-0.049
Token But
Token ID887
Feature activation-1.93e-02

Pos logit contributions
years+0.003
 airplanes+0.003
 chatting+0.003
Offic+0.003
 travelling+0.003
Neg logit contributions
 am-0.002
ses-0.002
 loser-0.002
ger-0.001
 hurts-0.001
Token,
Token ID11
Feature activation+3.76e-02

Pos logit contributions
 am+0.033
 loser+0.029
ses+0.028
kin+0.027
 hurts+0.027
Neg logit contributions
years-0.037
 airplanes-0.036
Offic-0.035
 chatting-0.035
 travelling-0.033
Token A
Token ID317
Feature activation+1.58e-02

Pos logit contributions
 am+0.000
 loser+0.000
kin+0.000
ses+0.000
 hurts+0.000
Neg logit contributions
years-0.001
 airplanes-0.001
 chatting-0.001
Offic-0.001
 travelling-0.001
Token man
Token ID582
Feature activation+3.53e-02

Pos logit contributions
years+0.032
 airplanes+0.030
 chatting+0.030
Offic+0.030
 travelling+0.029
Neg logit contributions
 am-0.025
 loser-0.022
kin-0.021
ses-0.021
 hurts-0.021
Token with
Token ID351
Feature activation+3.62e-02

Pos logit contributions
years+0.081
 airplanes+0.078
 chatting+0.077
Offic+0.077
 travelling+0.073
Neg logit contributions
 am-0.047
 loser-0.039
ses-0.037
 hurts-0.037
kin-0.037
Token a
Token ID257
Feature activation+3.22e-02

Pos logit contributions
years+0.086
Offic+0.082
 chatting+0.082
 airplanes+0.080
 terminal+0.078
Neg logit contributions
 am-0.044
 loser-0.037
ses-0.036
 hurts-0.035
kin-0.034
Token hat
Token ID6877
Feature activation+2.83e-02

Pos logit contributions
years+0.068
 airplanes+0.061
 chatting+0.061
Offic+0.061
 travelling+0.058
Neg logit contributions
 am-0.052
 loser-0.045
 hurts-0.043
ses-0.043
kin-0.043
Token and
Token ID290
Feature activation+1.04e-01

Pos logit contributions
years+0.059
 airplanes+0.055
Offic+0.054
 chatting+0.054
 travelling+0.051
Neg logit contributions
 am-0.045
 loser-0.039
ses-0.037
 hurts-0.037
kin-0.037
Token a
Token ID257
Feature activation+3.28e-02

Pos logit contributions
years+0.212
 airplanes+0.207
Offic+0.203
 chatting+0.203
 travelling+0.193
Neg logit contributions
 am-0.160
 loser-0.142
kin-0.136
ses-0.132
 hurts-0.131
Token a
Token ID257
Feature activation+2.82e-02

Pos logit contributions
years+0.062
 airplanes+0.058
 chatting+0.058
Offic+0.057
 travelling+0.054
Neg logit contributions
 am-0.057
 loser-0.051
ses-0.049
kin-0.048
 hurts-0.048
Tokenpron
Token ID31186
Feature activation-1.27e-02

Pos logit contributions
years+0.050
 airplanes+0.049
 chatting+0.048
Offic+0.048
 travelling+0.045
Neg logit contributions
 am-0.050
 loser-0.045
ses-0.043
kin-0.042
 hurts-0.042
Token is
Token ID318
Feature activation-5.09e-03

Pos logit contributions
 am+0.017
 loser+0.017
 fault+0.015
ses+0.014
 hurts+0.014
Neg logit contributions
 chatting-0.026
 airplanes-0.025
 travelling-0.024
 visiting-0.024
Offic-0.024
Token inside
Token ID2641
Feature activation-2.41e-02

Pos logit contributions
 am+0.010
 loser+0.009
ses+0.008
kin+0.008
 hurts+0.008
Neg logit contributions
years-0.009
 airplanes-0.008
 chatting-0.008
Offic-0.008
 travelling-0.008
Token unlocked
Token ID14838
Feature activation+2.31e-02

Pos logit contributions
 am+0.061
 hurts+0.044
kin+0.042
ses+0.041
ger+0.041
Neg logit contributions
years-0.122
Offic-0.119
 airplanes-0.114
 chatting-0.113
 terminal-0.109
Token trees
Token ID7150
Feature activation+2.93e-02

Pos logit contributions
years+0.051
Offic+0.048
 chatting+0.047
 airplanes+0.046
 terminal+0.044
Neg logit contributions
 am-0.033
 loser-0.027
kin-0.027
 hurts-0.026
ger-0.026
Token,
Token ID11
Feature activation+5.51e-02

Pos logit contributions
years+0.067
Offic+0.064
 airplanes+0.060
 chatting+0.058
 terminal+0.058
Neg logit contributions
 am-0.041
 loser-0.031
 hurts-0.031
ses-0.031
ro-0.030
Token and
Token ID290
Feature activation+8.77e-02

Pos logit contributions
years+0.116
Offic+0.111
irs+0.106
 terminal+0.101
 chatting+0.101
Neg logit contributions
 am-0.077
 is-0.065
 thing-0.063
ro-0.062
 hurts-0.061
Token lots
Token ID6041
Feature activation+4.91e-02

Pos logit contributions
years+0.155
Offic+0.155
 chatting+0.145
irs+0.141
 terminal+0.138
Neg logit contributions
 am-0.146
 thing-0.123
 hurts-0.119
ger-0.117
 loser-0.115
Token of
Token ID286
Feature activation+1.85e-02

Pos logit contributions
years+0.104
Offic+0.100
 chatting+0.095
 airplanes+0.095
 terminal+0.093
Neg logit contributions
 am-0.074
 loser-0.061
ger-0.060
 hurts-0.059
kin-0.059
Token other
Token ID584
Feature activation-9.59e-03

Pos logit contributions
Offic+0.036
years+0.035
 chatting+0.034
 terminal+0.032
 airplanes+0.032
Neg logit contributions
 am-0.030
 thing-0.025
ger-0.024
 hurts-0.024
 is-0.023
Token things
Token ID1243
Feature activation-2.92e-02

Pos logit contributions
 am+0.015
 thing+0.012
 hurts+0.012
 is+0.012
 loser+0.012
Neg logit contributions
years-0.020
Offic-0.020
 chatting-0.018
 terminal-0.017
 airplanes-0.017
Token too
Token ID1165
Feature activation-7.81e-02

Pos logit contributions
 am+0.038
 loser+0.032
ses+0.030
kin+0.030
 hurts+0.030
Neg logit contributions
years-0.067
 airplanes-0.064
Offic-0.064
 chatting-0.064
 travelling-0.061
Token.
Token ID13
Feature activation-1.55e-02

Pos logit contributions
 am+0.082
 loser+0.063
 hurts+0.060
kin+0.060
ses+0.059
Neg logit contributions
years-0.203
Offic-0.194
 airplanes-0.192
 chatting-0.191
 terminal-0.183
Token
Token ID220
Feature activation-1.64e-02

Pos logit contributions
 am+0.019
ses+0.015
 loser+0.015
 hurts+0.014
ger+0.014
Neg logit contributions
years-0.037
 airplanes-0.036
Offic-0.035
 chatting-0.035
 terminal-0.033
Token It
Token ID632
Feature activation-4.54e-02

Pos logit contributions
 am+0.010
 loser+0.009
kin+0.008
ses+0.008
 hurts+0.008
Neg logit contributions
years-0.014
 airplanes-0.013
 chatting-0.013
Offic-0.013
 travelling-0.013
Token was
Token ID373
Feature activation+4.07e-03

Pos logit contributions
 loser+0.070
 am+0.068
 fault+0.064
ses+0.062
kin+0.061
Neg logit contributions
 airplanes-0.087
 chatting-0.086
years-0.086
Offic-0.084
 inviting-0.083
Token a
Token ID257
Feature activation+4.83e-03

Pos logit contributions
years+0.009
Offic+0.008
 airplanes+0.008
 chatting+0.008
 terminal+0.008
Neg logit contributions
 am-0.006
 loser-0.005
ses-0.005
kin-0.005
 hurts-0.005
Token gen
Token ID2429
Feature activation+6.04e-02

Pos logit contributions
years+0.009
 airplanes+0.009
 chatting+0.008
Offic+0.008
 inviting+0.008
Neg logit contributions
 am-0.008
 loser-0.007
ses-0.007
 hurts-0.007
kin-0.007
Tokenie
Token ID494
Feature activation+5.48e-03

Pos logit contributions
years+0.098
 airplanes+0.092
Offic+0.091
 chatting+0.091
 terminal+0.085
Neg logit contributions
 am-0.121
 loser-0.109
kin-0.106
 hurts-0.104
ses-0.103
Token!
Token ID0
Feature activation-1.29e-02

Pos logit contributions
 airplanes+0.011
years+0.011
 chatting+0.011
 travelling+0.011
Offic+0.010
Neg logit contributions
 am-0.008
 loser-0.007
ses-0.007
 hurts-0.007
kin-0.007
Token
Token ID198
Feature activation+9.07e-03

Pos logit contributions
 am+0.020
 loser+0.018
kin+0.017
ses+0.017
 hurts+0.017
Neg logit contributions
years-0.026
Offic-0.025
 airplanes-0.025
 chatting-0.025
 terminal-0.023
Token
Token ID198
Feature activation+2.52e-03

Pos logit contributions
years+0.019
 airplanes+0.019
 chatting+0.018
Offic+0.018
 travelling+0.018
Neg logit contributions
 am-0.013
 loser-0.012
ses-0.011
kin-0.011
 hurts-0.011
TokenThe
Token ID464
Feature activation+9.60e-03

Pos logit contributions
years+0.005
 airplanes+0.005
 chatting+0.005
 travelling+0.005
Offic+0.005
Neg logit contributions
 am-0.004
 loser-0.003
kin-0.003
 hurts-0.003
ger-0.003
Token gen
Token ID2429
Feature activation+6.98e-02

Pos logit contributions
years+0.020
 airplanes+0.020
 chatting+0.019
Offic+0.019
 travelling+0.019
Neg logit contributions
 am-0.014
 loser-0.012
ses-0.012
kin-0.012
 hurts-0.011
Tokenie
Token ID494
Feature activation+1.63e-02

Pos logit contributions
years+0.116
Offic+0.111
 airplanes+0.110
 chatting+0.109
 terminal+0.104
Neg logit contributions
 am-0.138
kin-0.120
 loser-0.118
 hurts-0.118
ger-0.111
Token ever
Token ID1683
Feature activation-3.15e-02

Pos logit contributions
 am+0.023
 loser+0.021
ses+0.020
kin+0.020
 hurts+0.020
Neg logit contributions
 chatting-0.018
years-0.018
 airplanes-0.017
 travelling-0.017
 inviting-0.017
Token the
Token ID262
Feature activation-1.93e-02

Pos logit contributions
 am+0.062
 loser+0.057
ses+0.056
kin+0.054
 hurts+0.053
Neg logit contributions
years-0.051
 chatting-0.050
 airplanes-0.049
 travelling-0.047
 inviting-0.046
Token same
Token ID976
Feature activation-3.04e-02

Pos logit contributions
 am+0.032
ses+0.027
kin+0.026
 hurts+0.025
ro+0.025
Neg logit contributions
 airplanes-0.036
 travelling-0.035
 chatting-0.035
 inviting-0.034
years-0.034
Token again
Token ID757
Feature activation-7.73e-02

Pos logit contributions
 am+0.044
ses+0.042
 loser+0.038
 hurts+0.038
kin+0.036
Neg logit contributions
 chatting-0.060
 travelling-0.057
 airplanes-0.057
 visiting-0.055
 inviting-0.054
Token.
Token ID13
Feature activation-9.64e-03

Pos logit contributions
 am+0.091
 loser+0.084
ses+0.082
 hurts+0.076
kin+0.075
Neg logit contributions
years-0.178
 chatting-0.176
 airplanes-0.172
 travelling-0.167
Offic-0.166
Token<|endoftext|>
Token ID50256
Feature activation-1.73e-02

Pos logit contributions
 am+0.016
 loser+0.015
kin+0.015
ses+0.014
 hurts+0.014
Neg logit contributions
years-0.018
 chatting-0.017
Offic-0.017
 airplanes-0.017
 inviting-0.016
TokenOnce
Token ID7454
Feature activation-5.95e-03

Pos logit contributions
ses+0.034
 loser+0.033
 fault+0.031
 hurts+0.031
kin+0.030
Neg logit contributions
 chatting-0.026
 travelling-0.024
years-0.024
 visiting-0.024
 airplanes-0.024
Token upon
Token ID2402
Feature activation-2.97e-02

Pos logit contributions
 loser+0.014
 am+0.014
ses+0.013
 hurts+0.013
ger+0.013
Neg logit contributions
years-0.007
 airplanes-0.006
 chatting-0.006
 travelling-0.006
 inviting-0.006
Token a
Token ID257
Feature activation-2.74e-02

Pos logit contributions
 am+0.048
 loser+0.047
ses+0.046
 hurts+0.043
kin+0.043
Neg logit contributions
years-0.055
 airplanes-0.053
 chatting-0.053
Offic-0.051
 inviting-0.051
Token time
Token ID640
Feature activation-6.35e-02

Pos logit contributions
 am+0.038
 loser+0.035
ses+0.033
 hurts+0.032
kin+0.031
Neg logit contributions
years-0.057
 chatting-0.056
 airplanes-0.054
Offic-0.054
 inviting-0.054
Token,
Token ID11
Feature activation-1.03e-03

Pos logit contributions
 am+0.100
 loser+0.097
ses+0.095
 hurts+0.092
kin+0.089
Neg logit contributions
years-0.116
 chatting-0.114
 travelling-0.110
 airplanes-0.108
 visiting-0.108
Token clips
Token ID19166
Feature activation+5.07e-02

Pos logit contributions
 airplanes+0.058
years+0.056
 chatting+0.056
 travelling+0.053
Offic+0.052
Neg logit contributions
 am-0.038
 loser-0.033
ses-0.031
 hurts-0.030
kin-0.030
Token.
Token ID13
Feature activation+9.52e-04

Pos logit contributions
 airplanes+0.096
years+0.093
 chatting+0.092
 travelling+0.087
ents+0.086
Neg logit contributions
 am-0.083
 loser-0.077
ses-0.075
kin-0.071
 hurts-0.070
Token
Token ID198
Feature activation+2.02e-02

Pos logit contributions
years+0.002
 airplanes+0.002
 chatting+0.002
Offic+0.002
 terminal+0.002
Neg logit contributions
 am-0.002
ses-0.001
 hurts-0.001
 loser-0.001
'm-0.001
Token
Token ID198
Feature activation+1.28e-02

Pos logit contributions
years+0.049
 airplanes+0.047
 chatting+0.046
Offic+0.045
 terminal+0.044
Neg logit contributions
 am-0.026
 loser-0.020
 hurts-0.019
kin-0.018
'm-0.018
TokenWhen
Token ID2215
Feature activation-2.49e-02

Pos logit contributions
years+0.022
 airplanes+0.021
 chatting+0.021
 terminal+0.019
 travelling+0.019
Neg logit contributions
 am-0.026
'm-0.021
 loser-0.021
ger-0.020
 thing-0.020
Token he
Token ID339
Feature activation-1.11e-02

Pos logit contributions
 am+0.041
 loser+0.035
kin+0.033
 hurts+0.033
ses+0.033
Neg logit contributions
years-0.051
Offic-0.048
 airplanes-0.047
 chatting-0.047
 terminal-0.045
Token finished
Token ID5201
Feature activation-9.33e-04

Pos logit contributions
 am+0.018
 loser+0.015
kin+0.015
 hurts+0.014
ses+0.014
Neg logit contributions
years-0.023
 airplanes-0.022
 chatting-0.021
Offic-0.021
 travelling-0.020
Token,
Token ID11
Feature activation+3.37e-02

Pos logit contributions
 am+0.001
kin+0.001
 loser+0.001
 hurts+0.001
ses+0.001
Neg logit contributions
years-0.002
Offic-0.002
 airplanes-0.002
 terminal-0.002
 chatting-0.002
Token Joe
Token ID5689
Feature activation+1.74e-02

Pos logit contributions
years+0.066
Offic+0.062
 airplanes+0.062
 chatting+0.062
 terminal+0.058
Neg logit contributions
 am-0.057
 loser-0.049
 hurts-0.047
kin-0.047
ses-0.047
Token stepped
Token ID10764
Feature activation+5.53e-02

Pos logit contributions
years+0.037
Offic+0.035
 airplanes+0.035
 chatting+0.035
 travelling+0.033
Neg logit contributions
 am-0.026
 loser-0.022
 hurts-0.021
kin-0.021
ses-0.021
Token back
Token ID736
Feature activation+5.86e-03

Pos logit contributions
years+0.125
Offic+0.121
 airplanes+0.121
 chatting+0.119
 travelling+0.114
Neg logit contributions
 am-0.075
 loser-0.064
kin-0.060
 hurts-0.060
ses-0.059
Token she
Token ID673
Feature activation+1.92e-02

Pos logit contributions
years+0.170
 airplanes+0.160
 chatting+0.158
Offic+0.157
 travelling+0.148
Neg logit contributions
 am-0.197
 loser-0.177
ses-0.171
kin-0.170
 hurts-0.170
Token stayed
Token ID9658
Feature activation+3.23e-02

Pos logit contributions
years+0.033
Offic+0.031
 airplanes+0.031
 chatting+0.031
 travelling+0.029
Neg logit contributions
 am-0.036
 loser-0.032
 hurts-0.031
ses-0.031
kin-0.031
Token there
Token ID612
Feature activation-3.18e-02

Pos logit contributions
years+0.069
Offic+0.068
 airplanes+0.061
 terminal+0.060
 libraries+0.060
Neg logit contributions
 am-0.048
kin-0.041
 loser-0.040
 thing-0.038
 hurts-0.037
Token for
Token ID329
Feature activation-2.03e-02

Pos logit contributions
 am+0.048
 loser+0.041
kin+0.040
ses+0.039
 hurts+0.038
Neg logit contributions
years-0.067
Offic-0.066
 airplanes-0.063
 terminal-0.061
 chatting-0.061
Token quite
Token ID2407
Feature activation-2.18e-03

Pos logit contributions
 am+0.035
 loser+0.031
kin+0.029
ses+0.029
 hurts+0.028
Neg logit contributions
years-0.038
 airplanes-0.037
Offic-0.036
 chatting-0.036
 terminal-0.034
Token some
Token ID617
Feature activation-3.21e-03

Pos logit contributions
 am+0.004
 loser+0.003
kin+0.003
ses+0.003
 hurts+0.003
Neg logit contributions
years-0.004
 chatting-0.004
 airplanes-0.004
 travelling-0.004
Offic-0.004
Token time
Token ID640
Feature activation-2.86e-02

Pos logit contributions
 am+0.006
 loser+0.005
 hurts+0.005
kin+0.005
ses+0.005
Neg logit contributions
years-0.006
Offic-0.006
 airplanes-0.006
 chatting-0.005
 terminal-0.005
Token,
Token ID11
Feature activation+4.45e-02

Pos logit contributions
 am+0.031
 loser+0.026
kin+0.024
ses+0.024
 hurts+0.023
Neg logit contributions
years-0.071
Offic-0.069
 airplanes-0.068
 chatting-0.067
 terminal-0.065
Token playing
Token ID2712
Feature activation+1.47e-02

Pos logit contributions
years+0.075
Offic+0.071
 airplanes+0.069
 terminal+0.066
 libraries+0.062
Neg logit contributions
 am-0.086
 loser-0.075
ses-0.074
 hurts-0.071
 fault-0.071
Token with
Token ID351
Feature activation+1.46e-02

Pos logit contributions
years+0.028
 airplanes+0.027
 chatting+0.027
Offic+0.027
 travelling+0.025
Neg logit contributions
 am-0.025
 loser-0.022
ses-0.021
kin-0.021
 hurts-0.021
Token the
Token ID262
Feature activation+4.08e-03

Pos logit contributions
years+0.026
Offic+0.024
 airplanes+0.024
 chatting+0.024
 terminal+0.022
Neg logit contributions
 am-0.027
 loser-0.024
kin-0.023
 hurts-0.023
ses-0.023
Tokenman
Token ID805
Feature activation+1.48e-03

Pos logit contributions
 am+0.024
 loser+0.022
ses+0.021
+0.020
kin+0.020
Neg logit contributions
 airplanes-0.026
 chatting-0.022
 travelling-0.021
 fields-0.021
 cities-0.021
Token soup
Token ID17141
Feature activation-2.86e-02

Pos logit contributions
 airplanes+0.003
years+0.003
 chatting+0.003
Offic+0.003
 travelling+0.003
Neg logit contributions
 am-0.002
ses-0.002
 loser-0.002
kin-0.002
 hurts-0.002
Token.
Token ID13
Feature activation-1.74e-02

Pos logit contributions
 am+0.039
ses+0.034
 loser+0.034
 hurts+0.032
kin+0.030
Neg logit contributions
 airplanes-0.062
 chatting-0.060
years-0.059
Offic-0.056
 travelling-0.056
Token They
Token ID1119
Feature activation-2.92e-03

Pos logit contributions
 am+0.029
 loser+0.026
kin+0.026
 hurts+0.025
ses+0.025
Neg logit contributions
years-0.032
Offic-0.031
 chatting-0.031
 airplanes-0.030
 inviting-0.030
Token wish
Token ID4601
Feature activation-1.47e-02

Pos logit contributions
 am+0.004
 loser+0.004
kin+0.004
ses+0.004
 fault+0.003
Neg logit contributions
 airplanes-0.006
years-0.006
 chatting-0.006
Offic-0.006
 terminal-0.006
Token they
Token ID484
Feature activation-2.70e-02

Pos logit contributions
 am+0.030
 loser+0.028
ses+0.026
 hurts+0.026
kin+0.026
Neg logit contributions
years-0.023
 airplanes-0.022
 chatting-0.022
Offic-0.021
 travelling-0.020
Token had
Token ID550
Feature activation-3.23e-02

Pos logit contributions
 am+0.045
ses+0.043
 loser+0.043
kin+0.039
 hurts+0.039
Neg logit contributions
years-0.049
 airplanes-0.048
 chatting-0.047
Offic-0.045
 travelling-0.044
Token given
Token ID1813
Feature activation+2.84e-02

Pos logit contributions
 am+0.054
 loser+0.047
kin+0.046
ses+0.046
 hurts+0.046
Neg logit contributions
years-0.064
Offic-0.060
 airplanes-0.059
 chatting-0.058
 terminal-0.056
Token him
Token ID683
Feature activation-1.88e-02

Pos logit contributions
 airplanes+0.046
years+0.046
 chatting+0.043
Offic+0.042
 cities+0.041
Neg logit contributions
 am-0.055
 loser-0.051
kin-0.049
ses-0.048
 hurts-0.047
Token something
Token ID1223
Feature activation-6.09e-02

Pos logit contributions
 am+0.033
 loser+0.031
ses+0.030
kin+0.029
 hurts+0.028
Neg logit contributions
 airplanes-0.034
years-0.033
 chatting-0.031
 travelling-0.030
 cities-0.030
Token icy
Token ID30284
Feature activation-4.55e-03

Pos logit contributions
 am+0.115
 loser+0.115
ses+0.111
kin+0.104
ro+0.101
Neg logit contributions
 airplanes-0.099
 inviting-0.094
 chatting-0.094
years-0.091
 travelling-0.090
Token not
Token ID407
Feature activation-5.36e-02

Pos logit contributions
 am+0.069
 loser+0.065
ses+0.065
kin+0.061
ro+0.060
Neg logit contributions
 airplanes-0.072
years-0.070
 chatting-0.069
 inviting-0.067
Offic-0.067
Token hide
Token ID7808
Feature activation+2.39e-03

Pos logit contributions
 am+0.083
ses+0.080
kin+0.078
 loser+0.077
ger+0.073
Neg logit contributions
 chatting-0.099
 airplanes-0.096
years-0.096
 inviting-0.094
 travelling-0.093
Token magnets
Token ID37446
Feature activation-2.08e-02

Pos logit contributions
years+0.004
 airplanes+0.004
 chatting+0.004
 cities+0.004
 inviting+0.004
Neg logit contributions
 am-0.004
ses-0.004
 loser-0.004
 hurts-0.004
'm-0.003
Token in
Token ID287
Feature activation-2.52e-02

Pos logit contributions
 am+0.031
ses+0.029
 hurts+0.026
 loser+0.025
adia+0.025
Neg logit contributions
 airplanes-0.042
years-0.040
 chatting-0.039
 travelling-0.038
ents-0.037
Token the
Token ID262
Feature activation+5.73e-03

Pos logit contributions
 am+0.040
 loser+0.035
kin+0.033
 hurts+0.033
ses+0.032
Neg logit contributions
years-0.052
Offic-0.049
 chatting-0.049
 airplanes-0.049
 travelling-0.046
Token salad
Token ID20698
Feature activation-3.22e-02

Pos logit contributions
 airplanes+0.007
 fields+0.007
 cities+0.007
 shops+0.007
years+0.007
Neg logit contributions
 am-0.012
ses-0.012
 loser-0.011
kin-0.011
 fault-0.011
Token.
Token ID13
Feature activation-1.80e-02

Pos logit contributions
 am+0.040
ses+0.035
 loser+0.034
 hurts+0.032
kin+0.030
Neg logit contributions
 airplanes-0.072
years-0.072
 chatting-0.071
Offic-0.068
 travelling-0.067
Token They
Token ID1119
Feature activation-2.21e-03

Pos logit contributions
 am+0.027
kin+0.024
 hurts+0.023
 loser+0.023
ses+0.023
Neg logit contributions
years-0.037
Offic-0.036
 chatting-0.034
 airplanes-0.034
 inviting-0.033
Token are
Token ID389
Feature activation+1.14e-02

Pos logit contributions
 am+0.002
 loser+0.002
ses+0.002
kin+0.002
 hurts+0.002
Neg logit contributions
years-0.005
 airplanes-0.005
 chatting-0.005
Offic-0.005
 travelling-0.005
Token not
Token ID407
Feature activation-5.46e-02

Pos logit contributions
years+0.023
 airplanes+0.022
Offic+0.021
 chatting+0.021
 travelling+0.020
Neg logit contributions
 am-0.019
 loser-0.017
ses-0.016
kin-0.016
 hurts-0.016
Token good
Token ID922
Feature activation-5.23e-02

Pos logit contributions
 am+0.101
 loser+0.090
ses+0.087
kin+0.086
 hurts+0.086
Neg logit contributions
years-0.097
 airplanes-0.092
 chatting-0.091
Offic-0.088
 travelling-0.086
Token,
Token ID11
Feature activation+2.73e-02

Pos logit contributions
years+0.043
 airplanes+0.041
 chatting+0.041
Offic+0.039
 travelling+0.038
Neg logit contributions
 am-0.034
 loser-0.030
ses-0.029
 hurts-0.029
kin-0.028
Token "
Token ID366
Feature activation+8.21e-03

Pos logit contributions
years+0.058
 airplanes+0.054
Offic+0.053
 chatting+0.052
 terminal+0.051
Neg logit contributions
 am-0.041
 loser-0.035
ses-0.034
 hurts-0.033
kin-0.032
TokenThat
Token ID2504
Feature activation-6.56e-02

Pos logit contributions
years+0.015
Offic+0.014
 airplanes+0.014
 chatting+0.014
 travelling+0.013
Neg logit contributions
 am-0.014
 loser-0.013
ses-0.013
kin-0.012
 hurts-0.012
Token's
Token ID338
Feature activation-3.61e-02

Pos logit contributions
 am+0.103
 loser+0.102
kin+0.097
ses+0.097
ger+0.092
Neg logit contributions
 airplanes-0.123
Offic-0.112
 chatting-0.110
 cities-0.106
 talked-0.106
Token just
Token ID655
Feature activation-6.73e-02

Pos logit contributions
 am+0.065
 loser+0.060
ses+0.058
 hurts+0.057
ger+0.056
Neg logit contributions
years-0.062
 inviting-0.061
 airplanes-0.059
 chatting-0.059
 travelling-0.058
Token a
Token ID257
Feature activation-2.44e-02

Pos logit contributions
 am+0.114
 loser+0.105
kin+0.098
ses+0.097
 hurts+0.097
Neg logit contributions
years-0.125
 chatting-0.120
 airplanes-0.119
 travelling-0.116
 inviting-0.115
Token pretend
Token ID16614
Feature activation-4.36e-02

Pos logit contributions
 am+0.043
 loser+0.038
ses+0.038
kin+0.037
 hurts+0.037
Neg logit contributions
years-0.043
 airplanes-0.042
 chatting-0.041
Offic-0.041
 travelling-0.039
Token monster
Token ID9234
Feature activation-6.66e-02

Pos logit contributions
 am+0.079
 loser+0.071
ses+0.068
kin+0.068
 hurts+0.067
Neg logit contributions
years-0.076
 airplanes-0.074
 chatting-0.073
Offic-0.072
 travelling-0.068
Token,
Token ID11
Feature activation+5.55e-03

Pos logit contributions
 am+0.094
 loser+0.089
ses+0.088
kin+0.086
 hurts+0.079
Neg logit contributions
 airplanes-0.132
 chatting-0.129
years-0.126
 travelling-0.124
 visiting-0.123
Token Tim
Token ID5045
Feature activation-2.01e-02

Pos logit contributions
years+0.011
 airplanes+0.010
 chatting+0.010
Offic+0.010
 travelling+0.010
Neg logit contributions
 am-0.009
 loser-0.008
ses-0.008
kin-0.008
 hurts-0.008
Tokenmy
Token ID1820
Feature activation-1.49e-02

Pos logit contributions
 am+0.049
ses+0.047
 loser+0.046
kin+0.045
 hurts+0.042
Neg logit contributions
years-0.020
 airplanes-0.020
 chatting-0.019
Offic-0.017
 visiting-0.017
Token are
Token ID389
Feature activation-1.08e-02

Pos logit contributions
 am+0.102
 loser+0.088
ses+0.085
kin+0.083
 hurts+0.083
Neg logit contributions
years-0.171
 airplanes-0.164
 chatting-0.162
Offic-0.161
 travelling-0.155
Token very
Token ID845
Feature activation-9.49e-03

Pos logit contributions
 am+0.018
 loser+0.016
ses+0.016
 hurts+0.015
kin+0.015
Neg logit contributions
years-0.021
 airplanes-0.020
 chatting-0.020
Offic-0.020
 travelling-0.019
Token lucky
Token ID9670
Feature activation-3.44e-02

Pos logit contributions
 am+0.017
 loser+0.015
ses+0.015
kin+0.015
 hurts+0.015
Neg logit contributions
years-0.017
 airplanes-0.016
 chatting-0.016
Offic-0.016
 travelling-0.015
Token to
Token ID284
Feature activation-3.43e-02

Pos logit contributions
 am+0.061
 loser+0.054
ses+0.054
kin+0.053
 hurts+0.051
Neg logit contributions
 airplanes-0.061
years-0.061
 chatting-0.061
 travelling-0.058
 inviting-0.057
Token be
Token ID307
Feature activation-2.10e-02

Pos logit contributions
 am+0.042
 loser+0.037
ses+0.036
kin+0.035
 hurts+0.035
Neg logit contributions
years-0.080
Offic-0.077
 airplanes-0.077
 chatting-0.077
 travelling-0.074
Token able
Token ID1498
Feature activation-7.24e-02

Pos logit contributions
 am+0.036
 loser+0.032
kin+0.030
 hurts+0.029
ger+0.028
Neg logit contributions
years-0.039
Offic-0.038
 airplanes-0.038
 chatting-0.037
 terminal-0.037
Token to
Token ID284
Feature activation-2.39e-02

Pos logit contributions
 am+0.124
 loser+0.112
kin+0.110
ses+0.109
 hurts+0.104
Neg logit contributions
years-0.137
 airplanes-0.131
 chatting-0.131
Offic-0.124
 travelling-0.124
Token read
Token ID1100
Feature activation+2.10e-02

Pos logit contributions
 am+0.053
 loser+0.047
 hurts+0.046
kin+0.045
ses+0.045
Neg logit contributions
years-0.035
 airplanes-0.032
 chatting-0.032
Offic-0.031
 travelling-0.029
Token.
Token ID13
Feature activation-2.59e-02

Pos logit contributions
years+0.043
 chatting+0.041
Offic+0.040
 airplanes+0.040
 terminal+0.038
Neg logit contributions
 am-0.033
 loser-0.028
 hurts-0.027
kin-0.027
ses-0.026
Token Not
Token ID1892
Feature activation-3.67e-02

Pos logit contributions
 am+0.039
kin+0.035
 loser+0.035
ses+0.034
 hurts+0.034
Neg logit contributions
years-0.053
Offic-0.051
 chatting-0.049
 airplanes-0.048
 travelling-0.048
Token all
Token ID477
Feature activation-8.38e-03

Pos logit contributions
 am+0.065
 loser+0.057
 hurts+0.055
ses+0.054
kin+0.054
Neg logit contributions
years-0.068
 airplanes-0.065
Offic-0.065
 chatting-0.064
 terminal-0.060
Tokenumpy
Token ID32152
Feature activation-3.11e-02

Pos logit contributions
 am+0.007
ses+0.005
 loser+0.005
 hurts+0.005
 fault+0.005
Neg logit contributions
years-0.012
 airplanes-0.011
Offic-0.011
 chatting-0.011
 terminal-0.010
Token fisherman
Token ID48750
Feature activation+9.03e-03

Pos logit contributions
 am+0.054
 loser+0.046
 hurts+0.044
 fault+0.044
kin+0.044
Neg logit contributions
 airplanes-0.056
years-0.056
Offic-0.054
 chatting-0.054
 travelling-0.053
Token.
Token ID13
Feature activation-4.04e-02

Pos logit contributions
years+0.016
 airplanes+0.015
 chatting+0.015
Offic+0.015
 travelling+0.015
Neg logit contributions
 am-0.016
 loser-0.015
ses-0.014
kin-0.014
 hurts-0.014
Token He
Token ID679
Feature activation-6.12e-03

Pos logit contributions
 am+0.054
 loser+0.046
ses+0.044
kin+0.043
 hurts+0.043
Neg logit contributions
years-0.093
 airplanes-0.089
 chatting-0.088
Offic-0.088
 travelling-0.084
Token did
Token ID750
Feature activation-3.82e-02

Pos logit contributions
 am+0.011
 loser+0.011
 fault+0.011
ses+0.011
kin+0.010
Neg logit contributions
years-0.010
 airplanes-0.010
 chatting-0.010
Offic-0.009
 travelling-0.009
Token not
Token ID407
Feature activation-4.83e-02

Pos logit contributions
 am+0.047
 loser+0.037
 hurts+0.036
kin+0.034
ses+0.033
Neg logit contributions
years-0.096
Offic-0.090
 chatting-0.089
 airplanes-0.089
 terminal-0.085
Token like
Token ID588
Feature activation-8.91e-03

Pos logit contributions
 am+0.068
ses+0.067
 loser+0.062
 fault+0.062
kin+0.061
Neg logit contributions
years-0.095
 airplanes-0.093
 chatting-0.090
 travelling-0.090
Offic-0.090
Token to
Token ID284
Feature activation-2.67e-02

Pos logit contributions
 am+0.013
 loser+0.012
 hurts+0.011
ses+0.011
 fault+0.010
Neg logit contributions
years-0.019
Offic-0.018
 chatting-0.018
 airplanes-0.018
 terminal-0.017
Token talk
Token ID1561
Feature activation-1.03e-02

Pos logit contributions
 am+0.043
ses+0.040
 loser+0.039
kin+0.038
 fault+0.037
Neg logit contributions
 airplanes-0.049
years-0.049
 chatting-0.048
Offic-0.047
 inviting-0.046
Token to
Token ID284
Feature activation+3.79e-03

Pos logit contributions
 am+0.016
 loser+0.014
 hurts+0.014
ses+0.013
kin+0.013
Neg logit contributions
years-0.021
 airplanes-0.020
Offic-0.020
 chatting-0.020
 travelling-0.019
Token anyone
Token ID2687
Feature activation-5.63e-02

Pos logit contributions
years+0.007
 airplanes+0.006
 chatting+0.006
Offic+0.006
 travelling+0.006
Neg logit contributions
 am-0.007
 loser-0.006
ses-0.006
 hurts-0.006
kin-0.006
Token need
Token ID761
Feature activation-4.24e-02

Pos logit contributions
 am+0.073
ses+0.073
kin+0.068
 hurts+0.065
 loser+0.065
Neg logit contributions
years-0.098
 airplanes-0.093
 chatting-0.092
Offic-0.092
 travelling-0.089
Token Sam
Token ID3409
Feature activation-1.78e-02

Pos logit contributions
 am+0.073
 loser+0.065
kin+0.063
ses+0.062
 hurts+0.061
Neg logit contributions
years-0.081
 airplanes-0.077
 chatting-0.076
Offic-0.075
 travelling-0.072
Token.
Token ID13
Feature activation-1.70e-02

Pos logit contributions
 am+0.029
ses+0.027
kin+0.026
 loser+0.026
ger+0.024
Neg logit contributions
 airplanes-0.032
years-0.031
 chatting-0.031
 visiting-0.029
 travelling-0.029
Token He
Token ID679
Feature activation-1.18e-02

Pos logit contributions
 am+0.029
kin+0.028
 fault+0.025
 loser+0.025
ses+0.024
Neg logit contributions
years-0.029
Offic-0.029
 airplanes-0.026
 inviting-0.026
 libraries-0.026
Token is
Token ID318
Feature activation-1.27e-02

Pos logit contributions
kin+0.013
 loser+0.012
 am+0.011
ses+0.011
ro+0.010
Neg logit contributions
 airplanes-0.027
years-0.026
 chatting-0.026
 talked-0.025
 travelling-0.025
Token too
Token ID1165
Feature activation-9.32e-02

Pos logit contributions
 am+0.022
 loser+0.019
ses+0.019
kin+0.019
 hurts+0.018
Neg logit contributions
years-0.024
 chatting-0.022
 airplanes-0.022
 travelling-0.021
Offic-0.021
Token heavy
Token ID4334
Feature activation-4.41e-02

Pos logit contributions
 am+0.176
ses+0.158
 loser+0.155
kin+0.150
ger+0.146
Neg logit contributions
years-0.156
 airplanes-0.152
 chatting-0.148
 inviting-0.146
Offic-0.145
Token to
Token ID284
Feature activation-3.00e-02

Pos logit contributions
 am+0.078
kin+0.070
 loser+0.069
ses+0.068
 hurts+0.063
Neg logit contributions
 airplanes-0.079
years-0.077
 chatting-0.075
 travelling-0.073
 visiting-0.072
Token go
Token ID467
Feature activation+3.49e-03

Pos logit contributions
 am+0.046
ses+0.045
 loser+0.044
 fault+0.043
kin+0.041
Neg logit contributions
 airplanes-0.055
years-0.053
 inviting-0.053
 chatting-0.053
Offic-0.053
Token up
Token ID510
Feature activation-5.28e-02

Pos logit contributions
years+0.007
 airplanes+0.007
 chatting+0.007
Offic+0.007
 travelling+0.007
Neg logit contributions
 am-0.005
ses-0.005
 loser-0.005
kin-0.005
 hurts-0.004
Token the
Token ID262
Feature activation-2.83e-02

Pos logit contributions
 am+0.079
 loser+0.069
ses+0.068
kin+0.066
 hurts+0.065
Neg logit contributions
years-0.110
 airplanes-0.107
 chatting-0.105
Offic-0.103
 travelling-0.101
Token.
Token ID13
Feature activation-2.75e-02

Pos logit contributions
 am+0.147
ses+0.144
kin+0.137
 loser+0.133
ger+0.128
Neg logit contributions
 chatting-0.189
 visiting-0.187
years-0.186
 airplanes-0.186
 travelling-0.184
Token I
Token ID314
Feature activation-5.79e-02

Pos logit contributions
 am+0.041
kin+0.037
 loser+0.035
 fault+0.034
ger+0.033
Neg logit contributions
years-0.057
Offic-0.054
 libraries-0.050
 inviting-0.050
 visiting-0.049
Token won
Token ID1839
Feature activation-7.49e-02

Pos logit contributions
kin+0.084
ses+0.083
 loser+0.082
ger+0.072
ble+0.071
Neg logit contributions
 talked-0.107
years-0.101
Offic-0.099
 chatting-0.098
 inviting-0.094
Token't
Token ID470
Feature activation-4.63e-02

Pos logit contributions
ses+0.119
 am+0.116
kin+0.114
 loser+0.108
ger+0.106
Neg logit contributions
 airplanes-0.141
years-0.138
Offic-0.137
 chatting-0.133
 inviting-0.131
Token do
Token ID466
Feature activation-4.77e-02

Pos logit contributions
ses+0.075
kin+0.070
 am+0.070
 fault+0.066
 loser+0.066
Neg logit contributions
 airplanes-0.082
 inviting-0.080
Offic-0.080
 chatting-0.079
years-0.079
Token it
Token ID340
Feature activation-1.01e-01

Pos logit contributions
 am+0.097
ses+0.093
kin+0.088
 loser+0.088
ro+0.086
Neg logit contributions
 airplanes-0.071
years-0.071
 chatting-0.066
 inviting-0.064
 travelling-0.064
Token again
Token ID757
Feature activation-8.37e-02

Pos logit contributions
 am+0.155
ses+0.152
kin+0.138
 loser+0.135
ger+0.125
Neg logit contributions
years-0.194
 airplanes-0.190
 chatting-0.187
 inviting-0.179
Offic-0.177
Token.
Token ID13
Feature activation-2.16e-02

Pos logit contributions
 am+0.104
ses+0.101
kin+0.092
 loser+0.091
ger+0.083
Neg logit contributions
years-0.185
 airplanes-0.181
 chatting-0.180
 inviting-0.175
Offic-0.175
Token I
Token ID314
Feature activation-5.62e-02

Pos logit contributions
 am+0.031
kin+0.028
 loser+0.027
 fault+0.026
ger+0.026
Neg logit contributions
years-0.045
Offic-0.044
 inviting-0.041
 libraries-0.041
 airplanes-0.040
Token'm
Token ID1101
Feature activation-1.08e-02

Pos logit contributions
kin+0.081
ses+0.080
 loser+0.078
ble+0.069
ger+0.069
Neg logit contributions
 talked-0.104
years-0.100
 chatting-0.098
Offic-0.098
 inviting-0.094
Token sorry
Token ID7926
Feature activation-5.50e-02

Pos logit contributions
 am+0.016
 loser+0.014
ses+0.013
kin+0.013
 hurts+0.013
Neg logit contributions
years-0.023
 airplanes-0.022
Offic-0.022
 chatting-0.022
 cities-0.021
Token.
Token ID13
Feature activation-2.41e-02

Pos logit contributions
 am+0.057
ses+0.053
 loser+0.053
kin+0.053
+0.050
Neg logit contributions
 airplanes-0.063
 inviting-0.062
 chatting-0.062
years-0.061
 travelling-0.059
Token He
Token ID679
Feature activation-9.20e-03

Pos logit contributions
 am+0.049
kin+0.049
bery+0.048
ses+0.048
uch+0.047
Neg logit contributions
years-0.028
 libraries-0.027
Offic-0.025
 cities-0.025
 forests-0.024
Token realized
Token ID6939
Feature activation+2.23e-02

Pos logit contributions
 am+0.017
 loser+0.016
kin+0.015
 fault+0.015
ses+0.015
Neg logit contributions
years-0.016
 chatting-0.015
 airplanes-0.014
Offic-0.014
 inviting-0.014
Token he
Token ID339
Feature activation-1.89e-02

Pos logit contributions
years+0.042
 airplanes+0.040
 chatting+0.040
Offic+0.039
 travelling+0.037
Neg logit contributions
 am-0.039
 loser-0.034
ses-0.033
kin-0.033
 hurts-0.033
Token was
Token ID373
Feature activation+6.65e-04

Pos logit contributions
 loser+0.036
ses+0.034
 fault+0.033
 am+0.032
kin+0.032
Neg logit contributions
years-0.028
 talked-0.028
 chatting-0.026
 airplanes-0.026
Offic-0.025
Token wrong
Token ID2642
Feature activation-9.31e-02

Pos logit contributions
years+0.001
 chatting+0.001
 inviting+0.001
 travelling+0.001
 airplanes+0.001
Neg logit contributions
 am-0.001
ses-0.001
 loser-0.001
ro-0.001
kin-0.001
Token.
Token ID13
Feature activation-2.35e-02

Pos logit contributions
ses+0.142
 am+0.140
 loser+0.135
kin+0.129
+0.129
Neg logit contributions
 chatting-0.164
 visiting-0.164
years-0.163
 travelling-0.160
 airplanes-0.160
Token He
Token ID679
Feature activation-3.78e-03

Pos logit contributions
 am+0.045
kin+0.044
ses+0.044
bery+0.042
chy+0.042
Neg logit contributions
years-0.033
Offic-0.032
 libraries-0.030
 cities-0.028
 inviting-0.028
Token said
Token ID531
Feature activation+3.51e-02

Pos logit contributions
 am+0.007
 loser+0.006
kin+0.006
ses+0.006
 fault+0.006
Neg logit contributions
years-0.007
 airplanes-0.006
 chatting-0.006
Offic-0.006
 inviting-0.006
Token,
Token ID11
Feature activation+2.80e-02

Pos logit contributions
years+0.062
 airplanes+0.060
 chatting+0.058
 travelling+0.055
 visiting+0.055
Neg logit contributions
 am-0.058
ses-0.057
 loser-0.053
kin-0.053
 fault-0.051
Token "
Token ID366
Feature activation+9.62e-03

Pos logit contributions
years+0.055
 airplanes+0.053
 chatting+0.052
Offic+0.051
 terminal+0.049
Neg logit contributions
 am-0.047
 loser-0.040
ses-0.039
 hurts-0.039
kin-0.038
Token't
Token ID470
Feature activation-6.08e-02

Pos logit contributions
ses+0.106
 loser+0.103
 am+0.100
ger+0.096
 fault+0.096
Neg logit contributions
 airplanes-0.129
years-0.126
 chatting-0.122
 inviting-0.121
Offic-0.118
Token.
Token ID13
Feature activation-3.54e-02

Pos logit contributions
ses+0.092
 am+0.086
 loser+0.085
 fault+0.083
 hurts+0.078
Neg logit contributions
 airplanes-0.118
years-0.116
 chatting-0.111
 inviting-0.110
 travelling-0.108
Token The
Token ID383
Feature activation+4.99e-03

Pos logit contributions
 am+0.049
 loser+0.042
kin+0.042
 fault+0.041
ses+0.040
Neg logit contributions
years-0.077
Offic-0.074
 airplanes-0.070
 inviting-0.070
 chatting-0.070
Token needle
Token ID17598
Feature activation-1.65e-02

Pos logit contributions
 airplanes+0.009
years+0.008
Offic+0.008
 chatting+0.008
 inviting+0.008
Neg logit contributions
 am-0.009
ses-0.008
 loser-0.008
 hurts-0.007
kin-0.007
Token is
Token ID318
Feature activation-2.11e-02

Pos logit contributions
 loser+0.017
 am+0.015
ses+0.014
kin+0.013
+0.013
Neg logit contributions
 airplanes-0.038
 chatting-0.036
years-0.035
 talked-0.034
 travelling-0.034
Token too
Token ID1165
Feature activation-9.42e-02

Pos logit contributions
 am+0.036
 loser+0.032
ses+0.031
kin+0.030
 hurts+0.030
Neg logit contributions
years-0.040
 airplanes-0.038
 chatting-0.038
Offic-0.036
 travelling-0.036
Token big
Token ID1263
Feature activation-1.75e-02

Pos logit contributions
 am+0.171
 loser+0.158
ses+0.153
ger+0.144
kin+0.143
Neg logit contributions
years-0.161
 airplanes-0.159
 chatting-0.153
 inviting-0.149
Offic-0.148
Token and
Token ID290
Feature activation+4.30e-02

Pos logit contributions
 am+0.032
kin+0.029
 loser+0.029
ses+0.028
 hurts+0.027
Neg logit contributions
 airplanes-0.030
years-0.029
 chatting-0.028
 travelling-0.027
Offic-0.027
Token the
Token ID262
Feature activation-1.69e-02

Pos logit contributions
years+0.086
 airplanes+0.083
 chatting+0.080
Offic+0.080
 inviting+0.079
Neg logit contributions
 am-0.067
 loser-0.061
ses-0.056
kin-0.056
 fault-0.055
Token cloth
Token ID16270
Feature activation-1.76e-02

Pos logit contributions
 am+0.030
ses+0.026
 loser+0.026
 hurts+0.025
kin+0.025
Neg logit contributions
 airplanes-0.029
years-0.027
 chatting-0.026
Offic-0.026
 travelling-0.026
Token is
Token ID318
Feature activation-2.71e-02

Pos logit contributions
 loser+0.022
 am+0.019
ses+0.019
+0.017
inea+0.017
Neg logit contributions
 airplanes-0.038
 chatting-0.034
ents-0.033
 inviting-0.033
 talked-0.032
Token.
Token ID13
Feature activation-1.90e-02

Pos logit contributions
 am+0.063
kin+0.055
 loser+0.055
ses+0.054
 hurts+0.051
Neg logit contributions
years-0.077
 airplanes-0.075
 chatting-0.075
Offic-0.071
 travelling-0.070
Token He
Token ID679
Feature activation-3.36e-03

Pos logit contributions
 am+0.030
kin+0.029
 loser+0.028
ses+0.027
 fault+0.027
Neg logit contributions
years-0.036
Offic-0.035
 inviting-0.033
 libraries-0.032
 chatting-0.032
Token said
Token ID531
Feature activation+3.32e-02

Pos logit contributions
 am+0.006
 loser+0.005
ses+0.005
 hurts+0.005
kin+0.005
Neg logit contributions
years-0.006
 airplanes-0.006
 chatting-0.006
Offic-0.006
 travelling-0.006
Token,
Token ID11
Feature activation+2.83e-02

Pos logit contributions
years+0.062
 airplanes+0.061
 chatting+0.059
 travelling+0.056
Offic+0.056
Neg logit contributions
 am-0.053
ses-0.050
kin-0.049
 loser-0.048
 hurts-0.045
Token "
Token ID366
Feature activation+9.38e-03

Pos logit contributions
years+0.059
 airplanes+0.056
 chatting+0.055
Offic+0.054
 terminal+0.053
Neg logit contributions
 am-0.044
 loser-0.038
ses-0.037
 hurts-0.036
kin-0.035
TokenWhat
Token ID2061
Feature activation-1.04e-01

Pos logit contributions
years+0.018
 airplanes+0.017
 chatting+0.017
Offic+0.017
 travelling+0.016
Neg logit contributions
 am-0.016
 loser-0.014
ses-0.013
 hurts-0.013
kin-0.013
Token are
Token ID389
Feature activation-4.46e-02

Pos logit contributions
ses+0.111
 am+0.109
 loser+0.106
ro+0.101
ger+0.101
Neg logit contributions
 airplanes-0.235
years-0.234
Offic-0.229
 chatting-0.227
 libraries-0.222
Token you
Token ID345
Feature activation-1.06e-01

Pos logit contributions
 am+0.065
 loser+0.056
ses+0.053
 hurts+0.053
kin+0.053
Neg logit contributions
years-0.096
 airplanes-0.092
 chatting-0.092
Offic-0.091
 travelling-0.087
Token looking
Token ID2045
Feature activation-4.65e-02

Pos logit contributions
 am+0.233
 loser+0.211
ses+0.202
 hurts+0.202
kin+0.201
Neg logit contributions
years-0.155
 airplanes-0.145
Offic-0.143
 chatting-0.141
 terminal-0.133
Token at
Token ID379
Feature activation-1.48e-02

Pos logit contributions
 am+0.096
ses+0.089
 loser+0.089
kin+0.088
 hurts+0.086
Neg logit contributions
years-0.068
 airplanes-0.065
 chatting-0.063
 travelling-0.061
 inviting-0.059
Token,
Token ID11
Feature activation-4.29e-04

Pos logit contributions
 am+0.024
 loser+0.022
ses+0.021
kin+0.020
 hurts+0.020
Neg logit contributions
years-0.029
 airplanes-0.029
 chatting-0.027
Offic-0.027
 travelling-0.027