TOP ACTIVATIONS
MAX = 4.104

matter how sour something might seem , sweet things might be
matter how stupid it might seem . <|endoftext|> Once upon a
matter how uncomfortable something may seem at first , if we
how miserable the task may seem , courage is needed and
day after school . As time passed , Bob 's bows
" Jamie ," her mum said , " It
idea , sweet ie ," her Mom my said . "
. When bed time arrived , Freddy had finished
for a special purpose ," her mom explained . " You
" Maybe one day ," her Mom said , " But
When the sun began to set , Lucy 's
pipe , Sus ie ," her mum replied .
test . When afternoon came , Molly was ready
That man is bald ", her mum replied .
matter how distant something may seem , it is still attain
matter how adorable they may seem , it 's best to
stamp every day . As time passed , he started to
But after a few turns , Charlie got impatient and
But as the sun started to go down , Tim
I pick it please ?" Her Mum looked at the price

MIN ACTIVATIONS
MIN = -4.512

They find the rh in oc eros . They are gentle
turned out that the av oc ados were bad and made
had discovered the rh in oc eros , but they had
Look , James ! An aer opl ane !" said Dad
It was the rh in oc eros . It lifted its
eros . The rh in oc eros was very big and
Did you know that av oc ados are very healthy for
playing ! The rh in oc eros was very worried ,
s wrong , rh in oc eros ?" said
They saw the rh in oc eros . They thanked Mom
hurt . The rh in oc eros kept sn orting and
net on the rh in oc eros . They put the
They put the rh in oc eros in the truck .
loud . The rh in oc eros was so scared .
She saw the rh in oc eros and knew he was
care of the rh in oc eros and it needed to
try to grab the av oc ados . But the rock
It is the rh in oc eros !" The
it , the rh in oc eros smiled . He had
They heard the rh in oc eros sn ore .

INTERVAL 1.950 - 4.104
CONTAINS 0.498%

. One day , a little girl saw
could . When night came , the white
mom bought him the mint s and he enjoyed them so
the guitar and make fun sounds . He tried to play
happy . But one day when she was out playing ,

INTERVAL -0.204 - 1.950
CONTAINS 60.543%

catch his things , but they flew everywhere . The other
time and got very tired . Then they saw a big
. He forgot to eat his lunch . When
hear a voice . It is the farmer . He says
the o at soup in a bowl for her little girl

INTERVAL -2.358 - -0.204
CONTAINS 38.936%

more about the cookies and he asked his mom where they
there was a little girl named Lily who loved to play
the net . O ll ie was grateful and
Josh and Anne . Josh was a big , strong elephant
Anna and Ben nod . They sit next to Mom my

INTERVAL -4.512 - -2.358
CONTAINS 0.024%

stealing honey from the be eh ives without permission . She
, there was a big ostr ich who was very proud
She sees him under the bed . She sees the blanket
was still wearing her p aj amas .
, mom my ! The ostr ich is so fat !"
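
The four INTERVAL ranges above appear to be equal-width bins spanning MIN to MAX. The sketch below is only a check of that reading, not the dashboard's actual code: the function name `equal_width_edges` and the bin count of four are assumptions made for illustration, and the numbers are copied from the MIN, MAX, and CONTAINS lines.

```python
# Values copied from the MIN / MAX lines of the dump.
FEATURE_MIN = -4.512
FEATURE_MAX = 4.104
NUM_BINS = 4  # assumption: one bin per INTERVAL section listed above


def equal_width_edges(lo, hi, bins):
    """Return bins + 1 edges that split [lo, hi] into equal-width intervals.

    Hypothetical helper, not part of any dashboard library.
    """
    width = (hi - lo) / bins
    return [round(lo + i * width, 3) for i in range(bins + 1)]


edges = equal_width_edges(FEATURE_MIN, FEATURE_MAX, NUM_BINS)

# Share of tokens per interval, copied from the CONTAINS lines (highest bin first).
shares = [0.498, 60.543, 38.936, 0.024]
total_share = round(sum(shares), 3)

print(edges)
print(total_share)
```

Under these assumptions the edges come out as -4.512, -2.358, -0.204, 1.950, and 4.104, which matches the INTERVAL boundaries, and the shares sum to 100.001%, consistent with each percentage being rounded to three decimals.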
Token matter
Token ID2300
Feature activation-1.92e-02

Pos logit contributions
lor+0.008
 lair+0.008
 mistake+0.008
oral+0.008
 property+0.007
Neg logit contributions
 fit-0.003
cy-0.003
 went-0.003
 lined-0.003
 started-0.003
Token how
Token ID703
Feature activation+2.10e+00

Pos logit contributions
 lair+0.001
lor+0.001
oral+0.001
 mistake+0.001
cod+0.001
Neg logit contributions
 Tim-0.000
 really-0.000
 Bobby-0.000
 both-0.000
 went-0.000
Token sour
Token ID11348
Feature activation+2.36e+00

Pos logit contributions
 fit+0.032
 really+0.031
 went+0.029
 started+0.029
aved+0.028
Neg logit contributions
lor-0.098
 lair-0.097
 mistake-0.093
oral-0.088
land-0.087
Token something
Token ID1223
Feature activation+2.24e+00

Pos logit contributions
 lined+0.053
 fit+0.053
cy+0.052
aved+0.052
 swapped+0.050
Neg logit contributions
 mistake-0.088
 lair-0.084
lor-0.081
oral-0.080
 property-0.076
Token might
Token ID1244
Feature activation+1.35e+00

Pos logit contributions
 fit+0.058
 went+0.054
 lined+0.054
cy+0.053
 started+0.053
Neg logit contributions
 lair-0.081
 mistake-0.079
lor-0.078
oral-0.075
 property-0.069
Token seem
Token ID1283
Feature activation+4.10e+00

Pos logit contributions
 fit+0.030
 went+0.025
 really+0.025
 lined+0.023
 started+0.023
Neg logit contributions
lor-0.060
 lair-0.060
oral-0.055
 Majesty-0.054
 mistake-0.054
Token,
Token ID11
Feature activation+6.79e-01

Pos logit contributions
 fit+0.089
 lined+0.086
cy+0.082
 went+0.081
aved+0.081
Neg logit contributions
 mistake-0.159
 lair-0.157
lor-0.155
oral-0.150
 privacy-0.141
Token sweet
Token ID6029
Feature activation+4.43e-02

Pos logit contributions
 fit+0.010
 went+0.010
 really+0.010
 started+0.009
 matched+0.007
Neg logit contributions
 lair-0.034
lor-0.033
oral-0.032
 mistake-0.031
cod-0.030
Token things
Token ID1243
Feature activation+7.90e-01

Pos logit contributions
 went+0.001
 fit+0.001
cy+0.001
 started+0.001
 lined+0.001
Neg logit contributions
 lair-0.002
lor-0.002
 mistake-0.002
oral-0.002
cod-0.001
Token might
Token ID1244
Feature activation-4.56e-01

Pos logit contributions
 fit+0.016
 lined+0.016
aved+0.015
 went+0.015
cy+0.015
Neg logit contributions
 mistake-0.032
 lair-0.032
oral-0.030
lor-0.030
 Majesty-0.028
Token be
Token ID307
Feature activation+6.03e-01

Pos logit contributions
lor+0.021
 lair+0.020
oral+0.019
joy+0.018
 mistake+0.018
Neg logit contributions
 fit-0.010
 really-0.008
 went-0.007
 lined-0.007
 started-0.007
Token matter
Token ID2300
Feature activation-8.70e-02

Pos logit contributions
 mistake+0.042
 lair+0.041
lor+0.041
oral+0.040
 property+0.038
Neg logit contributions
 fit-0.018
 lined-0.018
cy-0.018
aved-0.017
 went-0.017
Token how
Token ID703
Feature activation+1.78e+00

Pos logit contributions
lor+0.004
 lair+0.004
oral+0.004
 mistake+0.003
 Majesty+0.003
Neg logit contributions
 Bobby-0.002
 really-0.002
 Tim-0.002
 lined-0.001
 fit-0.001
Token stupid
Token ID8531
Feature activation+2.26e+00

Pos logit contributions
 fit+0.032
 really+0.032
 went+0.030
 started+0.029
 lined+0.029
Neg logit contributions
lor-0.080
 lair-0.075
 mistake-0.072
oral-0.070
land-0.068
Token it
Token ID340
Feature activation+2.35e+00

Pos logit contributions
 fit+0.043
 lined+0.042
 went+0.040
cy+0.040
aved+0.039
Neg logit contributions
 mistake-0.094
 lair-0.091
lor-0.089
oral-0.087
 property-0.082
Token might
Token ID1244
Feature activation+1.28e+00

Pos logit contributions
 lined+0.041
 fit+0.041
cy+0.040
aved+0.040
 swapped+0.038
Neg logit contributions
 mistake-0.100
 lair-0.096
lor-0.093
oral-0.091
 property-0.088
Token seem
Token ID1283
Feature activation+4.02e+00

Pos logit contributions
 fit+0.027
 went+0.024
 lined+0.023
 really+0.022
 started+0.022
Neg logit contributions
lor-0.056
 lair-0.055
 mistake-0.053
oral-0.052
 Majesty-0.051
Token.
Token ID13
Feature activation-8.71e-02

Pos logit contributions
 fit+0.086
 lined+0.085
aved+0.081
cy+0.081
 went+0.080
Neg logit contributions
 mistake-0.157
 lair-0.152
lor-0.150
oral-0.145
 property-0.137
Token<|endoftext|>
Token ID50256
Feature activation+4.79e-01

Pos logit contributions
 mistake+0.004
 lair+0.003
lor+0.003
 property+0.003
 privacy+0.003
Neg logit contributions
cy-0.002
 went-0.002
 lined-0.002
 fit-0.002
 really-0.002
TokenOnce
Token ID7454
Feature activation+4.43e-01

Pos logit contributions
 lined+0.014
cy+0.014
aved+0.014
 fit+0.014
 went+0.013
Neg logit contributions
 mistake-0.015
 lair-0.014
lor-0.013
oral-0.013
 property-0.013
Token upon
Token ID2402
Feature activation+8.60e-01

Pos logit contributions
 lined+0.018
cy+0.018
 fit+0.018
aved+0.017
 swapped+0.017
Neg logit contributions
 mistake-0.009
 lair-0.008
lor-0.008
oral-0.007
 property-0.007
Token a
Token ID257
Feature activation+2.59e-01

Pos logit contributions
 fit+0.013
 lined+0.012
 went+0.012
cy+0.011
aved+0.011
Neg logit contributions
 mistake-0.040
 lair-0.038
lor-0.037
oral-0.037
 property-0.036
Token matter
Token ID2300
Feature activation+1.29e-01

Pos logit contributions
lor+0.021
 lair+0.020
oral+0.019
 mistake+0.019
 Majesty+0.018
Neg logit contributions
 fit-0.007
cy-0.007
 lined-0.006
 went-0.006
 really-0.006
Token how
Token ID703
Feature activation+1.94e+00

Pos logit contributions
 really+0.002
 Bobby+0.002
 fit+0.002
 as+0.002
 lined+0.002
Neg logit contributions
 lair-0.006
lor-0.006
oral-0.006
Offic-0.005
cod-0.005
Token uncomfortable
Token ID12916
Feature activation+2.82e+00

Pos logit contributions
 really+0.036
 fit+0.035
 started+0.032
 went+0.031
aved+0.030
Neg logit contributions
lor-0.088
 lair-0.085
 mistake-0.081
oral-0.078
Offic-0.077
Token something
Token ID1223
Feature activation+2.43e+00

Pos logit contributions
 fit+0.069
 went+0.066
 lined+0.065
 started+0.064
 really+0.061
Neg logit contributions
 lair-0.107
 mistake-0.102
oral-0.101
lor-0.101
cod-0.091
Token may
Token ID743
Feature activation+1.38e+00

Pos logit contributions
 fit+0.062
 went+0.057
 lined+0.057
 started+0.056
 really+0.055
Neg logit contributions
 lair-0.092
lor-0.089
 mistake-0.088
oral-0.086
 Majesty-0.079
Token seem
Token ID1283
Feature activation+3.99e+00

Pos logit contributions
 fit+0.024
 really+0.018
 went+0.017
 lined+0.016
 started+0.016
Neg logit contributions
lor-0.071
 lair-0.069
oral-0.066
Offic-0.065
joy-0.065
Token at
Token ID379
Feature activation+9.11e-01

Pos logit contributions
 fit+0.081
 lined+0.077
 went+0.072
 started+0.069
aved+0.069
Neg logit contributions
lor-0.164
 mistake-0.164
 lair-0.162
oral-0.157
 privacy-0.144
Token first
Token ID717
Feature activation+9.87e-01

Pos logit contributions
 fit+0.019
 really+0.017
 lined+0.016
 went+0.016
 matched+0.015
Neg logit contributions
lor-0.039
cod-0.037
oral-0.037
 lair-0.037
 mistake-0.036
Token,
Token ID11
Feature activation+5.68e-01

Pos logit contributions
 fit+0.017
 went+0.015
 lined+0.014
 started+0.014
 really+0.014
Neg logit contributions
 lair-0.045
oral-0.045
lor-0.044
 mistake-0.044
cod-0.042
Token if
Token ID611
Feature activation-5.17e-02

Pos logit contributions
 fit+0.012
 really+0.011
 went+0.011
 started+0.010
 Bobby+0.010
Neg logit contributions
 lair-0.024
lor-0.024
oral-0.023
 mistake-0.023
cod-0.022
Token we
Token ID356
Feature activation+3.68e-01

Pos logit contributions
 lair+0.002
lor+0.002
 mistake+0.002
cod+0.002
oral+0.002
Neg logit contributions
 fit-0.001
 went-0.001
 really-0.001
 started-0.001
 Bobby-0.001
Token how
Token ID703
Feature activation+2.15e+00

Pos logit contributions
 fit+0.034
 lined+0.032
 really+0.031
 went+0.030
 started+0.030
Neg logit contributions
lor-0.057
 lair-0.055
 mistake-0.054
oral-0.053
 Majesty-0.048
Token miserable
Token ID22444
Feature activation+2.44e+00

Pos logit contributions
 fit+0.043
 really+0.042
 started+0.040
 went+0.040
 lined+0.039
Neg logit contributions
lor-0.092
 lair-0.089
 mistake-0.084
oral-0.082
 Bible-0.079
Token the
Token ID262
Feature activation-1.28e-01

Pos logit contributions
 fit+0.056
 lined+0.055
 went+0.055
 started+0.052
cy+0.050
Neg logit contributions
 mistake-0.092
lor-0.091
 lair-0.090
oral-0.089
reetings-0.079
Token task
Token ID4876
Feature activation+1.02e+00

Pos logit contributions
lor+0.006
eds+0.005
 lair+0.005
asks+0.004
reetings+0.004
Neg logit contributions
 fit-0.003
 went-0.003
 really-0.003
 started-0.002
 lined-0.002
Token may
Token ID743
Feature activation+1.42e+00

Pos logit contributions
 fit+0.022
 lined+0.022
 went+0.021
 started+0.020
cy+0.020
Neg logit contributions
 mistake-0.040
 lair-0.039
lor-0.038
oral-0.037
 Majesty-0.036
Token seem
Token ID1283
Feature activation+3.95e+00

Pos logit contributions
 fit+0.029
 really+0.023
 went+0.023
 lined+0.021
 started+0.021
Neg logit contributions
lor-0.064
 lair-0.064
joy-0.061
 Bible-0.060
 mistake-0.060
Token,
Token ID11
Feature activation+4.37e-01

Pos logit contributions
 fit+0.082
 lined+0.080
 went+0.075
cy+0.074
aved+0.073
Neg logit contributions
 mistake-0.158
lor-0.155
 lair-0.155
oral-0.148
 privacy-0.139
Token courage
Token ID11917
Feature activation-1.25e+00

Pos logit contributions
 fit+0.007
 really+0.006
 went+0.006
 lined+0.006
 started+0.005
Neg logit contributions
lor-0.021
 lair-0.021
oral-0.019
 mistake-0.019
 Majesty-0.019
Token is
Token ID318
Feature activation+1.88e+00

Pos logit contributions
 lair+0.054
lor+0.051
 mistake+0.051
oral+0.051
 Majesty+0.050
Neg logit contributions
 fit-0.025
 lined-0.023
 went-0.023
 really-0.023
 started-0.023
Token needed
Token ID2622
Feature activation-5.12e-01

Pos logit contributions
 fit+0.046
 really+0.042
 lined+0.042
 went+0.042
 started+0.041
Neg logit contributions
lor-0.074
 lair-0.071
 mistake-0.069
oral-0.068
 Majesty-0.065
Token and
Token ID290
Feature activation+1.19e+00

Pos logit contributions
 lair+0.022
lor+0.022
 Majesty+0.022
oral+0.022
cod+0.021
Neg logit contributions
 fit-0.010
 went-0.009
 really-0.009
 started-0.009
 lined-0.008
Token day
Token ID1110
Feature activation-5.00e-01

Pos logit contributions
 mistake+0.030
 lair+0.029
lor+0.029
oral+0.028
 property+0.027
Neg logit contributions
 fit-0.008
cy-0.008
 lined-0.008
 went-0.008
aved-0.007
Token after
Token ID706
Feature activation+4.24e-01

Pos logit contributions
 mistake+0.020
 lair+0.019
lor+0.019
oral+0.018
 property+0.017
Neg logit contributions
 fit-0.011
 lined-0.011
 went-0.010
cy-0.010
aved-0.010
Token school
Token ID1524
Feature activation+3.31e-01

Pos logit contributions
 fit+0.007
 lined+0.007
 went+0.007
aved+0.006
 started+0.006
Neg logit contributions
 mistake-0.018
 lair-0.018
lor-0.018
oral-0.017
 property-0.017
Token.
Token ID13
Feature activation+1.75e-01

Pos logit contributions
 lined+0.007
 fit+0.007
 went+0.007
aved+0.006
cy+0.006
Neg logit contributions
 mistake-0.013
 lair-0.012
lor-0.012
oral-0.012
 property-0.012
Token As
Token ID1081
Feature activation+1.67e+00

Pos logit contributions
 lined+0.003
 went+0.003
 fit+0.003
cy+0.003
aved+0.003
Neg logit contributions
 mistake-0.008
 lair-0.008
lor-0.007
 property-0.007
oral-0.007
Token time
Token ID640
Feature activation+3.92e+00

Pos logit contributions
 lined+0.029
 fit+0.029
cy+0.028
aved+0.028
 went+0.027
Neg logit contributions
 mistake-0.072
 lair-0.069
lor-0.067
oral-0.066
 property-0.063
Token passed
Token ID3804
Feature activation+1.54e+00

Pos logit contributions
idget+0.081
cy+0.080
attered+0.078
ishy+0.076
organ+0.076
Neg logit contributions
oral-0.126
 mistake-0.122
 lair-0.117
joy-0.117
reetings-0.117
Token,
Token ID11
Feature activation+1.45e+00

Pos logit contributions
aved+0.037
 swapped+0.036
 lined+0.035
cy+0.035
 fit+0.035
Neg logit contributions
 mistake-0.054
 lair-0.051
lor-0.048
 witches-0.046
oral-0.046
Token Bob
Token ID5811
Feature activation-2.59e-02

Pos logit contributions
aved+0.031
cy+0.030
 lined+0.029
 swapped+0.029
 fit+0.028
Neg logit contributions
 mistake-0.056
 lair-0.053
oral-0.051
lor-0.050
 witches-0.049
Token's
Token ID338
Feature activation+6.36e-01

Pos logit contributions
 mistake+0.001
 lair+0.001
lor+0.001
oral+0.001
 privacy+0.001
Neg logit contributions
 went-0.000
 lined-0.000
 started-0.000
 fit-0.000
cy-0.000
Token bows
Token ID37469
Feature activation+1.55e-01

Pos logit contributions
 fit+0.019
 went+0.018
 lined+0.018
 swapped+0.018
cy+0.018
Neg logit contributions
 mistake-0.019
lor-0.019
 lair-0.018
oral-0.018
reetings-0.017
Token
Token ID198
Feature activation-2.55e-01

Pos logit contributions
 lined+0.005
 went+0.005
cy+0.004
 really+0.004
 started+0.004
Neg logit contributions
 mistake-0.017
 lair-0.016
 property-0.016
 privacy-0.016
cod-0.015
Token
Token ID198
Feature activation+4.44e-02

Pos logit contributions
 mistake+0.012
 lair+0.012
 property+0.012
 privacy+0.011
 witches+0.011
Neg logit contributions
 lined-0.004
 really-0.004
 went-0.004
inn-0.003
 fit-0.003
Token"
Token ID1
Feature activation+1.30e+00

Pos logit contributions
 lined+0.001
 really+0.001
 went+0.001
inn+0.001
 fit+0.001
Neg logit contributions
 mistake-0.002
 lair-0.002
 property-0.002
 privacy-0.002
 witches-0.001
TokenJamie
Token ID48337
Feature activation-2.72e-01

Pos logit contributions
cy+0.028
 lined+0.027
 went+0.026
 fit+0.026
 swapped+0.026
Neg logit contributions
 mistake-0.055
 lair-0.052
 property-0.047
lor-0.047
 privacy-0.046
Token,"
Token ID553
Feature activation+7.51e-01

Pos logit contributions
 lair+0.011
 mistake+0.011
oral+0.011
lor+0.011
 property+0.010
Neg logit contributions
 fit-0.005
cy-0.005
 lined-0.005
 went-0.005
 started-0.005
Token her
Token ID607
Feature activation+3.90e+00

Pos logit contributions
 fit+0.012
 lined+0.012
 went+0.012
cy+0.011
 started+0.011
Neg logit contributions
 mistake-0.034
 lair-0.033
lor-0.032
oral-0.031
 property-0.030
Token mum
Token ID25682
Feature activation+7.45e-01

Pos logit contributions
aved+0.105
 lined+0.100
cy+0.099
 swapped+0.097
organ+0.096
Neg logit contributions
 mistake-0.138
 lair-0.124
lor-0.118
oral-0.117
 Majesty-0.112
Token said
Token ID531
Feature activation-2.40e-01

Pos logit contributions
 lined+0.011
 fit+0.010
cy+0.010
aved+0.010
 swapped+0.009
Neg logit contributions
 mistake-0.034
 lair-0.033
lor-0.032
oral-0.032
 property-0.031
Token,
Token ID11
Feature activation-2.56e-01

Pos logit contributions
 mistake+0.009
 lair+0.008
lor+0.008
oral+0.008
 property+0.008
Neg logit contributions
 lined-0.005
cy-0.005
aved-0.005
 fit-0.005
 swapped-0.005
Token "
Token ID366
Feature activation-2.50e-01

Pos logit contributions
 mistake+0.009
 lair+0.009
oral+0.009
lor+0.009
 property+0.008
Neg logit contributions
 fit-0.006
 went-0.006
 lined-0.006
 started-0.006
 really-0.006
TokenIt
Token ID1026
Feature activation+7.91e-01

Pos logit contributions
 mistake+0.011
 lair+0.010
 privacy+0.009
 witches+0.009
 property+0.009
Neg logit contributions
cy-0.006
bo-0.005
 went-0.005
aved-0.005
 swapped-0.005
Token idea
Token ID2126
Feature activation+8.67e-01

Pos logit contributions
 fit+0.008
 went+0.007
 lined+0.006
 really+0.006
 matched+0.006
Neg logit contributions
lor-0.022
 lair-0.022
reetings-0.021
oral-0.021
 mistake-0.021
Token,
Token ID11
Feature activation+2.59e-01

Pos logit contributions
 fit+0.021
 lined+0.020
 went+0.019
 started+0.019
cy+0.018
Neg logit contributions
 lair-0.035
oral-0.034
lor-0.034
 mistake-0.033
 Majesty-0.031
Token sweet
Token ID6029
Feature activation+1.33e-01

Pos logit contributions
 fit+0.003
 went+0.003
 Tim+0.003
 lined+0.003
 really+0.003
Neg logit contributions
 lair-0.014
lor-0.013
oral-0.013
 mistake-0.012
 property-0.012
Tokenie
Token ID494
Feature activation-5.71e-01

Pos logit contributions
cy+0.003
 fit+0.003
 went+0.003
 lined+0.003
 started+0.003
Neg logit contributions
 lair-0.005
 mistake-0.005
lor-0.005
oral-0.005
 property-0.004
Token,"
Token ID553
Feature activation+1.25e+00

Pos logit contributions
 mistake+0.021
 lair+0.021
oral+0.021
lor+0.020
 property+0.018
Neg logit contributions
 fit-0.014
 went-0.013
cy-0.013
 lined-0.013
 started-0.012
Token her
Token ID607
Feature activation+3.90e+00

Pos logit contributions
aved+0.032
 lined+0.029
cy+0.028
organ+0.028
pool+0.027
Neg logit contributions
 mistake-0.046
oral-0.041
 lair-0.038
 reading-0.038
 witch-0.037
Token Mom
Token ID11254
Feature activation+1.55e+00

Pos logit contributions
aved+0.085
 lined+0.079
cy+0.073
pool+0.073
 swapped+0.071
Neg logit contributions
 mistake-0.163
 lair-0.143
oral-0.141
 privacy-0.132
lor-0.131
Tokenmy
Token ID1820
Feature activation+1.62e+00

Pos logit contributions
aved+0.049
 lined+0.047
pool+0.046
 fit+0.046
 Derek+0.046
Neg logit contributions
 mistake-0.044
 lair-0.038
oral-0.036
lor-0.036
 reading-0.035
Token said
Token ID531
Feature activation-4.68e-01

Pos logit contributions
aved+0.039
cy+0.036
pool+0.035
attered+0.034
organ+0.033
Neg logit contributions
 mistake-0.060
lor-0.053
oral-0.052
 lair-0.052
 reading-0.050
Token.
Token ID13
Feature activation-3.20e-02

Pos logit contributions
 mistake+0.015
 lair+0.013
lor+0.013
oral+0.012
 reading+0.012
Neg logit contributions
aved-0.013
cy-0.013
 lined-0.013
organ-0.013
 Derek-0.013
Token "
Token ID366
Feature activation-4.20e-01

Pos logit contributions
 mistake+0.001
 lair+0.001
lor+0.001
 property+0.001
oral+0.001
Neg logit contributions
 lined-0.001
 fit-0.001
 went-0.001
 swapped-0.001
cy-0.001
Token.
Token ID13
Feature activation-2.09e-01

Pos logit contributions
 fit+0.005
 lined+0.005
cy+0.005
 went+0.005
aved+0.005
Neg logit contributions
 mistake-0.011
 lair-0.011
oral-0.010
lor-0.010
 property-0.009
Token
Token ID198
Feature activation-5.03e-01

Pos logit contributions
 mistake+0.009
 lair+0.008
lor+0.008
 property+0.008
oral+0.008
Neg logit contributions
cy-0.004
 lined-0.004
aved-0.004
 fit-0.004
 went-0.004
Token
Token ID198
Feature activation+1.73e-02

Pos logit contributions
 mistake+0.021
 lair+0.021
 property+0.020
 privacy+0.019
 witches+0.019
Neg logit contributions
inn-0.009
 really-0.009
aved-0.009
 lined-0.009
 fit-0.009
TokenWhen
Token ID2215
Feature activation+2.38e+00

Pos logit contributions
inn+0.000
 fit+0.000
 really+0.000
 went+0.000
 swapped+0.000
Neg logit contributions
 mistake-0.001
 lair-0.001
 Majesty-0.001
 privacy-0.001
lor-0.001
Token bed
Token ID3996
Feature activation+1.91e+00

Pos logit contributions
 lined+0.051
aved+0.051
cy+0.049
 fit+0.048
 swapped+0.047
Neg logit contributions
 mistake-0.093
 lair-0.086
oral-0.084
lor-0.083
 witches-0.080
Tokentime
Token ID2435
Feature activation+3.90e+00

Pos logit contributions
 lined+0.040
cy+0.040
aved+0.040
 fit+0.039
 swapped+0.038
Neg logit contributions
 mistake-0.074
 lair-0.070
lor-0.068
oral-0.067
 property-0.064
Token arrived
Token ID5284
Feature activation+1.00e+00

Pos logit contributions
aved+0.129
idget+0.122
cy+0.119
organ+0.118
attered+0.117
Neg logit contributions
 mistake-0.104
oral-0.095
 story-0.088
lor-0.086
 witch-0.083
Token,
Token ID11
Feature activation+1.22e+00

Pos logit contributions
 fit+0.017
cy+0.017
 went+0.017
 lined+0.017
 started+0.016
Neg logit contributions
 mistake-0.044
 lair-0.043
lor-0.042
oral-0.041
 property-0.039
Token Freddy
Token ID45437
Feature activation-4.37e-03

Pos logit contributions
 fit+0.021
 lined+0.021
cy+0.021
 went+0.020
aved+0.020
Neg logit contributions
 mistake-0.053
 lair-0.051
lor-0.050
oral-0.048
 property-0.047
Token had
Token ID550
Feature activation-1.14e+00

Pos logit contributions
 mistake+0.000
oral+0.000
 lair+0.000
lor+0.000
 privacy+0.000
Neg logit contributions
 went-0.000
 started-0.000
 fit-0.000
 lined-0.000
cy-0.000
Token finished
Token ID5201
Feature activation+2.82e-01

Pos logit contributions
 mistake+0.052
oral+0.050
lor+0.049
 lair+0.049
 property+0.047
Neg logit contributions
 fit-0.020
 lined-0.019
 went-0.019
 started-0.019
 swapped-0.019
Token for
Token ID329
Feature activation-6.23e-01

Pos logit contributions
 fit+0.002
 lined+0.002
 went+0.002
 started+0.002
 really+0.002
Neg logit contributions
 mistake-0.005
oral-0.005
lor-0.005
 lair-0.005
 Majesty-0.004
Token a
Token ID257
Feature activation+4.16e-01

Pos logit contributions
lor+0.024
oral+0.024
 lair+0.023
 mistake+0.023
reetings+0.021
Neg logit contributions
 fit-0.015
 really-0.014
 went-0.013
 lined-0.013
 started-0.013
Token special
Token ID2041
Feature activation+1.12e+00

Pos logit contributions
 fit+0.010
 went+0.009
 really+0.009
 started+0.009
 lined+0.009
Neg logit contributions
lor-0.015
 mistake-0.015
 lair-0.015
oral-0.015
 property-0.014
Token purpose
Token ID4007
Feature activation+6.10e-01

Pos logit contributions
 fit+0.027
 went+0.025
 lined+0.024
 really+0.024
 started+0.024
Neg logit contributions
lor-0.040
 lair-0.040
 mistake-0.040
oral-0.038
 yourselves-0.037
Token,"
Token ID553
Feature activation+1.64e+00

Pos logit contributions
 fit+0.015
 lined+0.015
cy+0.014
 went+0.014
 started+0.014
Neg logit contributions
 mistake-0.022
lor-0.022
 lair-0.021
oral-0.021
joy-0.020
Token her
Token ID607
Feature activation+3.88e+00

Pos logit contributions
aved+0.033
 lined+0.030
cy+0.029
pool+0.028
 fit+0.028
Neg logit contributions
 mistake-0.068
oral-0.062
 lair-0.060
 privacy-0.059
lor-0.059
Token mom
Token ID1995
Feature activation+1.16e+00

Pos logit contributions
 lined+0.070
 fit+0.070
cy+0.069
aved+0.068
 swapped+0.065
Neg logit contributions
 mistake-0.164
 lair-0.157
lor-0.153
oral-0.149
 property-0.144
Token explained
Token ID4893
Feature activation+9.16e-01

Pos logit contributions
aved+0.025
cy+0.023
 lined+0.022
 fit+0.022
pool+0.022
Neg logit contributions
 mistake-0.046
 lair-0.041
lor-0.040
oral-0.040
 reading-0.038
Token.
Token ID13
Feature activation+4.93e-01

Pos logit contributions
 lined+0.020
 fit+0.019
cy+0.019
aved+0.019
 swapped+0.018
Neg logit contributions
 mistake-0.035
 lair-0.034
lor-0.033
oral-0.032
 property-0.031
Token "
Token ID366
Feature activation-7.44e-02

Pos logit contributions
 lined+0.012
 went+0.012
 fit+0.011
 started+0.011
 really+0.011
Neg logit contributions
 mistake-0.019
 lair-0.018
lor-0.017
oral-0.017
 property-0.017
TokenYou
Token ID1639
Feature activation+3.92e-02

Pos logit contributions
 mistake+0.003
 lair+0.003
 Majesty+0.003
lor+0.003
 majesty+0.003
Neg logit contributions
 lined-0.002
 really-0.002
cy-0.002
organ-0.002
 fit-0.001
Token"
Token ID1
Feature activation+4.24e-01

Pos logit contributions
 lined+0.020
 fit+0.020
cy+0.020
 swapped+0.019
 went+0.019
Neg logit contributions
 mistake-0.023
 lair-0.022
lor-0.021
oral-0.020
 property-0.019
TokenMaybe
Token ID13300
Feature activation-1.31e+00

Pos logit contributions
aved+0.008
 lined+0.008
cy+0.007
 fit+0.007
 swapped+0.007
Neg logit contributions
 mistake-0.017
 lair-0.017
lor-0.017
oral-0.016
 property-0.015
Token one
Token ID530
Feature activation-6.62e-01

Pos logit contributions
 mistake+0.056
 lair+0.056
lor+0.055
oral+0.054
cod+0.051
Neg logit contributions
 went-0.028
 fit-0.028
 started-0.025
 lined-0.025
 really-0.024
Token day
Token ID1110
Feature activation+5.59e-01

Pos logit contributions
 mistake+0.031
 lair+0.030
lor+0.030
joy+0.028
 Majesty+0.028
Neg logit contributions
 went-0.011
 fit-0.011
 lined-0.011
cy-0.010
 really-0.010
Token,"
Token ID553
Feature activation+1.25e+00

Pos logit contributions
 went+0.012
 fit+0.011
 started+0.010
 really+0.010
 got+0.010
Neg logit contributions
 lair-0.026
oral-0.024
 mistake-0.024
lor-0.024
cod-0.024
Token her
Token ID607
Feature activation+3.87e+00

Pos logit contributions
aved+0.035
 lined+0.032
cy+0.031
organ+0.031
attered+0.031
Neg logit contributions
 mistake-0.043
oral-0.037
lor-0.035
 privacy-0.035
 witch-0.035
Token Mom
Token ID11254
Feature activation+1.62e+00

Pos logit contributions
aved+0.090
 lined+0.087
cy+0.083
 swapped+0.082
pool+0.081
Neg logit contributions
 mistake-0.151
 lair-0.137
oral-0.132
lor-0.126
 Majesty-0.123
Token said
Token ID531
Feature activation-1.89e-01

Pos logit contributions
aved+0.037
cy+0.034
pool+0.033
attered+0.032
 fit+0.032
Neg logit contributions
 mistake-0.059
lor-0.053
 lair-0.053
oral-0.053
land-0.052
Token,
Token ID11
Feature activation-4.08e-01

Pos logit contributions
 mistake+0.006
 lair+0.005
lor+0.005
land+0.005
 reading+0.005
Neg logit contributions
aved-0.005
cy-0.005
 lined-0.005
organ-0.005
 Derek-0.005
Token "
Token ID366
Feature activation-4.10e-01

Pos logit contributions
 lair+0.014
 mistake+0.014
oral+0.013
lor+0.013
 privacy+0.013
Neg logit contributions
 lined-0.011
 fit-0.011
 went-0.011
 started-0.011
 swapped-0.011
TokenBut
Token ID1537
Feature activation+1.75e-01

Pos logit contributions
 mistake+0.018
 lair+0.017
 witches+0.015
 Majesty+0.015
 majesty+0.015
Neg logit contributions
cy-0.009
 went-0.008
 fit-0.008
 lined-0.008
 swapped-0.008
Token
Token ID198
Feature activation-6.13e-01

Pos logit contributions
 mistake+0.006
 lair+0.006
lor+0.005
oral+0.005
 property+0.005
Neg logit contributions
 lined-0.003
cy-0.003
 fit-0.003
aved-0.003
 went-0.003
Token
Token ID198
Feature activation-2.66e-01

Pos logit contributions
 mistake+0.023
 lair+0.023
lor+0.021
 property+0.021
 witches+0.020
Neg logit contributions
 fit-0.014
 lined-0.014
 went-0.013
aved-0.013
cy-0.013
TokenWhen
Token ID2215
Feature activation+2.11e+00

Pos logit contributions
 lair+0.011
 mistake+0.011
 Majesty+0.010
 property+0.010
 witches+0.009
Neg logit contributions
 fit-0.006
 went-0.006
organ-0.006
 lined-0.006
 really-0.006
Token the
Token ID262
Feature activation+9.01e-01

Pos logit contributions
aved+0.047
 lined+0.047
cy+0.045
 fit+0.045
 swapped+0.045
Neg logit contributions
 mistake-0.081
 lair-0.073
lor-0.072
oral-0.072
 witches-0.068
Token sun
Token ID4252
Feature activation+1.85e+00

Pos logit contributions
 lined+0.022
aved+0.022
cy+0.021
 fit+0.021
 swapped+0.021
Neg logit contributions
 mistake-0.033
 lair-0.031
oral-0.029
lor-0.029
 property-0.028
Token began
Token ID2540
Feature activation+3.86e+00

Pos logit contributions
aved+0.029
organ+0.027
 lined+0.027
 swapped+0.026
pool+0.026
Neg logit contributions
 mistake-0.078
 lair-0.073
lor-0.073
oral-0.073
 witch-0.067
Token to
Token ID284
Feature activation+1.34e+00

Pos logit contributions
cy+0.110
aved+0.109
 swapped+0.109
organ+0.108
 lined+0.107
Neg logit contributions
 mistake-0.116
 lair-0.104
oral-0.101
lor-0.097
joy-0.092
Token set
Token ID900
Feature activation+1.32e+00

Pos logit contributions
 lined+0.020
aved+0.020
cy+0.020
 fit+0.020
 swapped+0.019
Neg logit contributions
 mistake-0.060
 lair-0.057
lor-0.056
oral-0.055
 property-0.053
Token,
Token ID11
Feature activation+6.65e-01

Pos logit contributions
 fit+0.025
 lined+0.025
cy+0.024
 went+0.024
aved+0.023
Neg logit contributions
 mistake-0.055
 lair-0.054
lor-0.052
oral-0.051
 property-0.049
Token Lucy
Token ID22162
Feature activation-4.59e-03

Pos logit contributions
 fit+0.009
 lined+0.009
cy+0.009
 went+0.009
 started+0.008
Neg logit contributions
 mistake-0.031
 lair-0.031
lor-0.030
oral-0.029
 property-0.028
Token's
Token ID338
Feature activation+8.34e-01

Pos logit contributions
 mistake+0.000
 lair+0.000
oral+0.000
lor+0.000
 property+0.000
Neg logit contributions
 went-0.000
 lined-0.000
 fit-0.000
 started-0.000
cy-0.000
Token pipe
Token ID12656
Feature activation-1.55e-01

Pos logit contributions
lor+0.016
 mistake+0.015
 lair+0.015
oral+0.015
reetings+0.014
Neg logit contributions
 fit-0.010
 went-0.009
 lined-0.009
 started-0.009
cy-0.009
Token,
Token ID11
Feature activation-1.74e-01

Pos logit contributions
oral+0.006
 mistake+0.006
joy+0.006
 lair+0.006
lor+0.006
Neg logit contributions
 fit-0.004
 lined-0.003
 went-0.003
 started-0.003
 really-0.003
Token Sus
Token ID8932
Feature activation-1.60e-01

Pos logit contributions
 lair+0.009
lor+0.008
oral+0.008
 mistake+0.008
reetings+0.007
Neg logit contributions
 fit-0.003
 lined-0.003
 really-0.003
 went-0.003
 Bobby-0.003
Tokenie
Token ID494
Feature activation-6.68e-01

Pos logit contributions
 lair+0.008
 mistake+0.007
oral+0.007
lor+0.007
 privacy+0.007
Neg logit contributions
cy-0.002
 lined-0.002
 went-0.002
 fit-0.002
aved-0.002
Token,"
Token ID553
Feature activation+1.37e+00

Pos logit contributions
 lair+0.025
oral+0.025
 mistake+0.024
lor+0.024
joy+0.022
Neg logit contributions
 fit-0.016
 went-0.016
 started-0.015
 lined-0.015
cy-0.015
Token her
Token ID607
Feature activation+3.84e+00

Pos logit contributions
aved+0.027
cy+0.026
 lined+0.026
 fit+0.024
 swapped+0.024
Neg logit contributions
 mistake-0.056
 lair-0.051
oral-0.051
lor-0.051
 privacy-0.048
Token mum
Token ID25682
Feature activation+9.40e-01

Pos logit contributions
aved+0.098
 lined+0.095
cy+0.093
 swapped+0.091
 fit+0.090
Neg logit contributions
 mistake-0.139
 lair-0.127
lor-0.120
oral-0.119
 Majesty-0.114
Token replied
Token ID8712
Feature activation+9.71e-03

Pos logit contributions
aved+0.017
cy+0.017
 lined+0.016
 fit+0.016
 swapped+0.016
Neg logit contributions
 mistake-0.040
 lair-0.037
lor-0.036
oral-0.035
 property-0.034
Token.
Token ID13
Feature activation+1.16e-01

Pos logit contributions
 fit+0.000
 lined+0.000
cy+0.000
aved+0.000
 went+0.000
Neg logit contributions
 mistake-0.000
 lair-0.000
lor-0.000
oral-0.000
 property-0.000
Token
Token ID198
Feature activation+3.22e-02

Pos logit contributions
 lined+0.002
 went+0.002
 fit+0.002
 started+0.002
 really+0.002
Neg logit contributions
 mistake-0.005
 lair-0.005
 property-0.005
lor-0.005
 privacy-0.005
Token
Token ID198
Feature activation+2.11e-01

Pos logit contributions
 lined+0.001
 went+0.001
 really+0.001
 fit+0.001
aved+0.001
Neg logit contributions
 lair-0.001
 mistake-0.001
 witches-0.001
 privacy-0.001
 property-0.001
Token test
Token ID1332
Feature activation-7.52e-01

Pos logit contributions
 fit+0.008
 lined+0.008
 really+0.007
 went+0.007
 started+0.007
Neg logit contributions
lor-0.013
 lair-0.012
 mistake-0.012
oral-0.011
reetings-0.011
Token.
Token ID13
Feature activation+1.53e-01

Pos logit contributions
 mistake+0.031
 lair+0.030
lor+0.030
oral+0.029
reetings+0.027
Neg logit contributions
 fit-0.015
 lined-0.014
 went-0.014
 started-0.013
 really-0.013
Token
Token ID198
Feature activation-3.05e-01

Pos logit contributions
cy+0.003
 lined+0.003
 fit+0.003
aved+0.003
 went+0.003
Neg logit contributions
 mistake-0.006
 lair-0.006
lor-0.006
 property-0.006
oral-0.006
Token
Token ID198
Feature activation+2.38e-01

Pos logit contributions
 mistake+0.012
 lair+0.011
lor+0.010
 property+0.010
oral+0.010
Neg logit contributions
 lined-0.007
 fit-0.007
cy-0.007
aved-0.007
 went-0.007
TokenWhen
Token ID2215
Feature activation+2.30e+00

Pos logit contributions
 fit+0.006
 lined+0.005
 went+0.005
cy+0.005
 really+0.005
Neg logit contributions
 mistake-0.009
 lair-0.009
 property-0.008
lor-0.008
 privacy-0.008
Token afternoon
Token ID6672
Feature activation+3.79e+00

Pos logit contributions
aved+0.060
 lined+0.059
 swapped+0.054
organ+0.054
 fit+0.054
Neg logit contributions
 mistake-0.081
oral-0.073
 lair-0.072
lor-0.071
 witches-0.069
Token came
Token ID1625
Feature activation+9.46e-01

Pos logit contributions
aved+0.125
important+0.118
organ+0.113
idget+0.113
attered+0.112
Neg logit contributions
 mistake-0.096
oral-0.093
 story-0.086
lor-0.085
 reading-0.078
Token,
Token ID11
Feature activation+9.77e-01

Pos logit contributions
cy+0.019
 went+0.019
 fit+0.018
 lined+0.018
aved+0.017
Neg logit contributions
 mistake-0.040
 lair-0.038
lor-0.037
oral-0.036
cod-0.036
Token Molly
Token ID30236
Feature activation-3.59e-01

Pos logit contributions
aved+0.028
 lined+0.027
 swapped+0.025
bo+0.024
organ+0.024
Neg logit contributions
 mistake-0.031
oral-0.028
 lair-0.027
 witches-0.027
lor-0.027
Token was
Token ID373
Feature activation-1.14e-01

Pos logit contributions
 mistake+0.016
 lair+0.016
oral+0.016
lor+0.015
 property+0.015
Neg logit contributions
 went-0.007
 started-0.006
 lined-0.006
 fit-0.006
 swapped-0.005
Token ready
Token ID3492
Feature activation-9.62e-01

Pos logit contributions
 mistake+0.005
 lair+0.005
lor+0.005
oral+0.005
 property+0.004
Neg logit contributions
 lined-0.002
 fit-0.002
cy-0.002
aved-0.002
 swapped-0.002
TokenThat
Token ID2504
Feature activation-2.86e-01

Pos logit contributions
 lined+0.020
cy+0.020
 fit+0.020
aved+0.020
 swapped+0.019
Neg logit contributions
 mistake-0.028
 lair-0.027
lor-0.026
oral-0.026
 property-0.024
Token man
Token ID582
Feature activation+5.71e-01

Pos logit contributions
 lair+0.010
 mistake+0.010
oral+0.010
lor+0.009
joy+0.009
Neg logit contributions
 went-0.007
 fit-0.007
 started-0.007
 lined-0.007
 really-0.007
Token is
Token ID318
Feature activation+9.46e-01

Pos logit contributions
 fit+0.017
 lined+0.017
cy+0.017
 went+0.017
 started+0.016
Neg logit contributions
 mistake-0.017
 lair-0.017
lor-0.016
oral-0.016
 property-0.014
Token bald
Token ID28273
Feature activation-1.56e-01

Pos logit contributions
 fit+0.020
 lined+0.020
 went+0.018
 really+0.018
cy+0.018
Neg logit contributions
 mistake-0.037
 lair-0.036
lor-0.036
oral-0.035
 property-0.033
Token",
Token ID1600
Feature activation+2.05e+00

Pos logit contributions
 mistake+0.005
oral+0.005
lor+0.005
 lair+0.005
 Majesty+0.005
Neg logit contributions
 fit-0.004
 lined-0.004
 went-0.004
cy-0.004
 really-0.004
Token her
Token ID607
Feature activation+3.78e+00

Pos logit contributions
aved+0.042
cy+0.041
 lined+0.040
 fit+0.038
 swapped+0.037
Neg logit contributions
 mistake-0.083
 lair-0.076
lor-0.074
oral-0.073
 witches-0.071
Token mum
Token ID25682
Feature activation+5.53e-01

Pos logit contributions
aved+0.120
 lined+0.114
 swapped+0.112
pool+0.107
organ+0.106
Neg logit contributions
 mistake-0.125
 lair-0.105
oral-0.097
 Majesty-0.097
joy-0.095
Token replied
Token ID8712
Feature activation-1.07e-01

Pos logit contributions
aved+0.010
cy+0.010
 lined+0.010
 fit+0.010
 swapped+0.009
Neg logit contributions
 mistake-0.023
 lair-0.021
lor-0.021
oral-0.020
 witches-0.019
Token.
Token ID13
Feature activation-1.30e-01

Pos logit contributions
 mistake+0.004
 lair+0.003
lor+0.003
oral+0.003
 reading+0.003
Neg logit contributions
aved-0.003
cy-0.003
 lined-0.003
 fit-0.003
 swapped-0.003
Token
Token ID198
Feature activation-3.47e-01

Pos logit contributions
 mistake+0.005
 lair+0.004
lor+0.004
oral+0.004
 property+0.004
Neg logit contributions
 lined-0.003
 went-0.003
 fit-0.003
cy-0.003
 started-0.003
Token
Token ID198
Feature activation-9.22e-02

Pos logit contributions
 mistake+0.015
 lair+0.015
 property+0.014
 privacy+0.014
lor+0.013
Neg logit contributions
 lined-0.006
 went-0.006
cy-0.006
 fit-0.006
 really-0.006
Token matter
Token ID2300
Feature activation-1.97e-01

Pos logit contributions
lor+0.031
 lair+0.031
 mistake+0.031
oral+0.030
 property+0.028
Neg logit contributions
 fit-0.013
cy-0.013
 lined-0.013
 went-0.012
aved-0.012
Token how
Token ID703
Feature activation+1.71e+00

Pos logit contributions
 lair+0.009
lor+0.009
oral+0.009
 Majesty+0.008
cod+0.008
Neg logit contributions
 Bobby-0.003
 really-0.003
 Tim-0.003
 fit-0.003
 went-0.003
Token distant
Token ID12899
Feature activation+2.10e+00

Pos logit contributions
 fit+0.032
 really+0.030
aved+0.029
 started+0.028
 went+0.028
Neg logit contributions
lor-0.075
 lair-0.074
 mistake-0.071
oral-0.067
land-0.066
Token something
Token ID1223
Feature activation+2.09e+00

Pos logit contributions
 fit+0.057
 went+0.055
 lined+0.054
 started+0.053
cy+0.052
Neg logit contributions
 lair-0.073
 mistake-0.072
lor-0.071
oral-0.070
cod-0.062
Token may
Token ID743
Feature activation+1.22e+00

Pos logit contributions
 fit+0.054
 lined+0.052
 went+0.051
 started+0.050
aved+0.049
Neg logit contributions
 lair-0.074
 mistake-0.073
lor-0.072
oral-0.069
 Majesty-0.063
Token seem
Token ID1283
Feature activation+3.75e+00

Pos logit contributions
 fit+0.024
 went+0.018
 really+0.018
 lined+0.018
 started+0.017
Neg logit contributions
lor-0.057
 lair-0.056
oral-0.053
 mistake-0.053
joy-0.052
Token,
Token ID11
Feature activation+4.12e-01

Pos logit contributions
 lined+0.079
 fit+0.078
cy+0.076
aved+0.076
 went+0.073
Neg logit contributions
 mistake-0.147
 lair-0.140
lor-0.137
oral-0.134
 property-0.127
Token it
Token ID340
Feature activation+7.12e-02

Pos logit contributions
 fit+0.009
 really+0.009
 went+0.008
 Bobby+0.008
 started+0.008
Neg logit contributions
 lair-0.017
lor-0.017
oral-0.017
 mistake-0.016
Offic-0.015
Token is
Token ID318
Feature activation+8.44e-01

Pos logit contributions
 fit+0.002
 went+0.002
 started+0.002
cy+0.002
 lined+0.002
Neg logit contributions
 lair-0.003
 mistake-0.003
lor-0.003
oral-0.003
joy-0.002
Token still
Token ID991
Feature activation-4.34e-03

Pos logit contributions
 fit+0.017
 really+0.016
 lined+0.015
 went+0.014
 started+0.014
Neg logit contributions
lor-0.038
 lair-0.036
oral-0.034
 mistake-0.034
 Majesty-0.033
Token attain
Token ID18188
Feature activation+2.22e-01

Pos logit contributions
lor+0.000
 lair+0.000
oral+0.000
 mistake+0.000
 Majesty+0.000
Neg logit contributions
 fit-0.000
 really-0.000
 went-0.000
 lined-0.000
 started-0.000
Token matter
Token ID2300
Feature activation+8.42e-02

Pos logit contributions
 lair+0.015
 mistake+0.015
lor+0.014
oral+0.014
 Majesty+0.013
Neg logit contributions
 fit-0.007
 lined-0.007
cy-0.007
 went-0.007
 started-0.006
Token how
Token ID703
Feature activation+1.91e+00

Pos logit contributions
 Tim+0.001
 Bobby+0.001
 really+0.001
 as+0.001
 went+0.001
Neg logit contributions
 lair-0.004
lor-0.004
oral-0.004
 mistake-0.003
 Majesty-0.003
Token adorable
Token ID29012
Feature activation+2.22e+00

Pos logit contributions
 fit+0.032
 really+0.031
 went+0.030
aved+0.030
 started+0.029
Neg logit contributions
 lair-0.085
lor-0.084
 mistake-0.082
oral-0.076
land-0.075
Token they
Token ID484
Feature activation+1.05e+00

Pos logit contributions
 fit+0.044
 went+0.043
 lined+0.043
 started+0.041
cy+0.040
Neg logit contributions
 lair-0.092
 mistake-0.091
lor-0.087
oral-0.086
 Majesty-0.080
Token may
Token ID743
Feature activation+7.43e-01

Pos logit contributions
 lined+0.019
 fit+0.018
cy+0.018
aved+0.018
 swapped+0.017
Neg logit contributions
 mistake-0.044
 lair-0.042
lor-0.041
oral-0.041
 property-0.039
Token seem
Token ID1283
Feature activation+3.74e+00

Pos logit contributions
 fit+0.015
 really+0.013
 went+0.013
 started+0.012
 lined+0.012
Neg logit contributions
lor-0.034
 lair-0.034
oral-0.032
 Majesty-0.032
 mistake-0.031
Token,
Token ID11
Feature activation+6.52e-01

Pos logit contributions
 fit+0.075
 lined+0.074
 went+0.069
cy+0.068
aved+0.067
Neg logit contributions
 mistake-0.153
 lair-0.150
lor-0.149
oral-0.144
 privacy-0.134
Token it
Token ID340
Feature activation+1.82e-01

Pos logit contributions
 fit+0.008
 went+0.008
 started+0.007
 really+0.007
 Bobby+0.006
Neg logit contributions
lor-0.033
 lair-0.033
oral-0.032
 mistake-0.032
reetings-0.031
Token's
Token ID338
Feature activation+4.89e-01

Pos logit contributions
 went+0.004
 started+0.004
 fit+0.004
cy+0.004
 lined+0.004
Neg logit contributions
 lair-0.007
 mistake-0.007
lor-0.007
oral-0.007
joy-0.006
Token best
Token ID1266
Feature activation-9.91e-02

Pos logit contributions
 fit+0.012
 lined+0.011
 went+0.011
 really+0.011
 started+0.011
Neg logit contributions
lor-0.018
 lair-0.018
 mistake-0.018
oral-0.017
 Majesty-0.016
Token to
Token ID284
Feature activation+1.54e-01

Pos logit contributions
oral+0.004
lor+0.004
cod+0.004
 lair+0.004
 mistake+0.004
Neg logit contributions
 fit-0.002
 went-0.002
 started-0.002
 lined-0.002
 really-0.002
Token stamp
Token ID17977
Feature activation+1.11e-01

Pos logit contributions
 fit+0.016
 went+0.013
 matched+0.013
 started+0.013
 really+0.013
Neg logit contributions
lor-0.026
 lair-0.026
 mistake-0.025
reetings-0.024
oral-0.024
Token every
Token ID790
Feature activation+1.36e-02

Pos logit contributions
 fit+0.003
 lined+0.003
aved+0.003
cy+0.003
 went+0.002
Neg logit contributions
 mistake-0.004
 lair-0.004
lor-0.004
oral-0.004
 property-0.003
Token day
Token ID1110
Feature activation+2.47e-01

Pos logit contributions
 fit+0.000
 went+0.000
cy+0.000
 lined+0.000
aved+0.000
Neg logit contributions
 mistake-0.001
 lair-0.001
lor-0.001
oral-0.001
 property-0.001
Token.
Token ID13
Feature activation+8.84e-02

Pos logit contributions
 lined+0.006
 fit+0.005
cy+0.005
aved+0.005
 swapped+0.005
Neg logit contributions
 mistake-0.009
 lair-0.009
lor-0.009
oral-0.008
 property-0.008
Token As
Token ID1081
Feature activation+1.49e+00

Pos logit contributions
 lined+0.001
 fit+0.001
cy+0.001
 went+0.001
aved+0.001
Neg logit contributions
 mistake-0.004
 lair-0.004
lor-0.004
oral-0.004
 property-0.004
Token time
Token ID640
Feature activation+3.74e+00

Pos logit contributions
 fit+0.020
 went+0.020
 lined+0.020
 started+0.018
cy+0.018
Neg logit contributions
 mistake-0.070
 lair-0.068
lor-0.068
oral-0.066
 property-0.065
Token passed
Token ID3804
Feature activation+1.34e+00

Pos logit contributions
cy+0.078
ishy+0.074
attered+0.073
idget+0.071
aved+0.070
Neg logit contributions
oral-0.125
 mistake-0.124
lor-0.118
 lair-0.116
 reading-0.115
Token,
Token ID11
Feature activation+1.11e+00

Pos logit contributions
aved+0.030
 lined+0.030
cy+0.029
 fit+0.029
 swapped+0.029
Neg logit contributions
 mistake-0.050
 lair-0.047
lor-0.046
oral-0.045
 property-0.043
Token he
Token ID339
Feature activation-3.81e-01

Pos logit contributions
 lined+0.021
 fit+0.020
cy+0.020
aved+0.020
 swapped+0.019
Neg logit contributions
 mistake-0.046
 lair-0.044
lor-0.043
oral-0.042
 property-0.040
Token started
Token ID2067
Feature activation+1.24e+00

Pos logit contributions
 mistake+0.018
oral+0.017
lor+0.017
 lair+0.017
 property+0.017
Neg logit contributions
 went-0.007
 started-0.007
 fit-0.006
 lined-0.006
 got-0.006
Token to
Token ID284
Feature activation-5.71e-01

Pos logit contributions
 lined+0.028
 fit+0.027
 went+0.027
 started+0.027
 swapped+0.025
Neg logit contributions
 mistake-0.047
lor-0.047
oral-0.046
 lair-0.045
reetings-0.044
Token
Token ID198
Feature activation-1.91e-01

Pos logit contributions
 mistake+0.018
 lair+0.017
 property+0.016
 Majesty+0.016
 privacy+0.016
Neg logit contributions
 lined-0.011
 went-0.011
 really-0.011
 fit-0.011
aved-0.010
TokenBut
Token ID1537
Feature activation+6.50e-01

Pos logit contributions
 mistake+0.008
 lair+0.008
 Majesty+0.007
 property+0.007
 privacy+0.007
Neg logit contributions
 went-0.004
 really-0.004
 lined-0.004
 started-0.004
 fit-0.003
Token after
Token ID706
Feature activation+2.16e-01

Pos logit contributions
 started+0.009
 went+0.009
 fit+0.009
 lined+0.009
cy+0.009
Neg logit contributions
 mistake-0.030
lor-0.030
 lair-0.030
oral-0.029
 property-0.027
Token a
Token ID257
Feature activation+1.74e+00

Pos logit contributions
 fit+0.005
 lined+0.005
 went+0.004
 Derek+0.004
 really+0.004
Neg logit contributions
lor-0.008
 mistake-0.008
 lair-0.008
oral-0.008
 property-0.008
Token few
Token ID1178
Feature activation+5.80e-01

Pos logit contributions
aved+0.046
cy+0.043
 lined+0.043
 swapped+0.041
 Derek+0.040
Neg logit contributions
 mistake-0.063
oral-0.056
 lair-0.054
lor-0.052
 witch-0.052
Token turns
Token ID4962
Feature activation+3.74e+00

Pos logit contributions
 fit+0.018
 lined+0.018
 went+0.017
 started+0.017
cy+0.017
Neg logit contributions
lor-0.017
 mistake-0.017
 lair-0.017
oral-0.015
 property-0.015
Token,
Token ID11
Feature activation+1.25e+00

Pos logit contributions
cy+0.089
aved+0.086
organ+0.080
 swapped+0.078
bo+0.077
Neg logit contributions
 mistake-0.135
 lair-0.124
oral-0.117
lor-0.117
 witches-0.115
Token Charlie
Token ID11526
Feature activation+3.09e-01

Pos logit contributions
 fit+0.025
 went+0.024
 started+0.024
 lined+0.024
cy+0.023
Neg logit contributions
 mistake-0.052
lor-0.049
 lair-0.049
oral-0.047
land-0.045
Token got
Token ID1392
Feature activation+2.26e-01

Pos logit contributions
 lined+0.006
 fit+0.006
cy+0.006
aved+0.006
 went+0.005
Neg logit contributions
 mistake-0.013
 lair-0.012
lor-0.012
oral-0.012
 property-0.011
Token impatient
Token ID33440
Feature activation+7.64e-01

Pos logit contributions
cy+0.006
pool+0.006
aved+0.006
suit+0.006
oun+0.005
Neg logit contributions
 mistake-0.008
 lair-0.007
joy-0.007
 reading-0.007
 mistakes-0.006
Token and
Token ID290
Feature activation-3.69e-01

Pos logit contributions
 lined+0.022
aved+0.022
cy+0.022
 fit+0.022
 swapped+0.021
Neg logit contributions
 mistake-0.024
 lair-0.022
lor-0.021
oral-0.021
 property-0.020
Token
Token ID198
Feature activation-3.11e-01

Pos logit contributions
 mistake+0.028
 lair+0.027
lor+0.026
oral+0.025
 property+0.025
Neg logit contributions
 lined-0.018
 fit-0.018
cy-0.018
aved-0.017
 went-0.017
TokenBut
Token ID1537
Feature activation+9.40e-02

Pos logit contributions
 mistake+0.009
 lair+0.008
lor+0.008
oral+0.008
 property+0.007
Neg logit contributions
cy-0.010
 lined-0.010
aved-0.010
 fit-0.010
 swapped-0.010
Token as
Token ID355
Feature activation+2.01e+00

Pos logit contributions
aved+0.002
 lined+0.002
cy+0.002
 fit+0.002
 swapped+0.002
Neg logit contributions
 mistake-0.004
 lair-0.004
lor-0.003
oral-0.003
 property-0.003
Token the
Token ID262
Feature activation+9.56e-01

Pos logit contributions
aved+0.046
 swapped+0.045
 lined+0.043
cy+0.042
pool+0.041
Neg logit contributions
 mistake-0.075
 lair-0.065
oral-0.065
 witches-0.064
 remembering-0.063
Token sun
Token ID4252
Feature activation+2.19e+00

Pos logit contributions
 lined+0.021
 fit+0.021
cy+0.021
aved+0.021
 swapped+0.020
Neg logit contributions
 mistake-0.036
 lair-0.034
lor-0.033
oral-0.033
 property-0.031
Token started
Token ID2067
Feature activation+3.73e+00

Pos logit contributions
aved+0.039
 Derek+0.036
 swapped+0.035
pool+0.035
 Drew+0.034
Neg logit contributions
 mistake-0.082
lor-0.078
oral-0.077
 lair-0.076
 witch-0.076
Token to
Token ID284
Feature activation+1.58e+00

Pos logit contributions
cy+0.118
 swapped+0.115
aved+0.112
 lined+0.111
organ+0.109
Neg logit contributions
 mistake-0.104
 lair-0.096
oral-0.087
 witches-0.087
lor-0.083
Token go
Token ID467
Feature activation+1.27e+00

Pos logit contributions
 lined+0.025
aved+0.025
cy+0.025
 fit+0.024
 swapped+0.024
Neg logit contributions
 mistake-0.070
 lair-0.065
lor-0.064
oral-0.063
 property-0.060
Token down
Token ID866
Feature activation+1.20e+00

Pos logit contributions
 swapped+0.028
aved+0.026
 lined+0.025
 Derek+0.024
 matched+0.024
Neg logit contributions
 mistake-0.046
oral-0.041
 witches-0.040
 witch-0.040
 property-0.039
Token,
Token ID11
Feature activation+1.08e+00

Pos logit contributions
 swapped+0.031
aved+0.030
cy+0.030
 lined+0.030
 fit+0.028
Neg logit contributions
 mistake-0.040
 lair-0.037
oral-0.036
lor-0.034
 witches-0.034
Token Tim
Token ID5045
Feature activation+4.04e-02

Pos logit contributions
cy+0.018
aved+0.018
 lined+0.017
 swapped+0.016
inn+0.016
Neg logit contributions
 mistake-0.047
 lair-0.043
oral-0.042
 witches-0.042
lor-0.041
Token I
Token ID314
Feature activation+1.11e+00

Pos logit contributions
 lair+0.011
 mistake+0.010
lor+0.009
 property+0.009
oral+0.009
Neg logit contributions
 fit-0.008
 went-0.007
 really-0.007
 started-0.007
 Tim-0.007
Token pick
Token ID2298
Feature activation+3.70e-01

Pos logit contributions
 lined+0.024
aved+0.023
cy+0.023
 fit+0.022
 swapped+0.022
Neg logit contributions
 mistake-0.043
 lair-0.040
lor-0.039
oral-0.039
 property-0.037
Token it
Token ID340
Feature activation+1.78e+00

Pos logit contributions
 lined+0.009
 fit+0.009
cy+0.009
aved+0.009
 swapped+0.009
Neg logit contributions
 mistake-0.013
 lair-0.012
lor-0.012
oral-0.012
 property-0.011
Token please
Token ID3387
Feature activation-5.26e-01

Pos logit contributions
 fit+0.042
cy+0.041
 lined+0.039
 went+0.039
aved+0.037
Neg logit contributions
 mistake-0.068
 lair-0.066
lor-0.062
oral-0.062
 privacy-0.059
Token?"
Token ID1701
Feature activation+1.66e+00

Pos logit contributions
 lair+0.019
 mistake+0.017
lor+0.017
 witches+0.016
oral+0.016
Neg logit contributions
 fit-0.016
 really-0.015
 went-0.015
 lined-0.014
cy-0.014
Token Her
Token ID2332
Feature activation+3.73e+00

Pos logit contributions
aved+0.043
 lined+0.043
 fit+0.042
cy+0.041
 swapped+0.040
Neg logit contributions
 mistake-0.057
 lair-0.053
lor-0.052
oral-0.051
 property-0.047
Token Mum
Token ID38367
Feature activation+8.18e-01

Pos logit contributions
aved+0.099
 lined+0.090
pool+0.086
organ+0.085
 swapped+0.085
Neg logit contributions
 mistake-0.140
 lair-0.122
oral-0.117
 Majesty-0.113
joy-0.113
Token looked
Token ID3114
Feature activation+3.21e-01

Pos logit contributions
aved+0.019
pool+0.017
 lined+0.017
organ+0.016
attered+0.016
Neg logit contributions
 mistake-0.030
 lair-0.026
lor-0.026
oral-0.026
 reading-0.025
Token at
Token ID379
Feature activation-1.99e-01

Pos logit contributions
 lined+0.007
aved+0.007
cy+0.007
 fit+0.007
 swapped+0.007
Neg logit contributions
 mistake-0.012
 lair-0.011
lor-0.011
oral-0.011
 property-0.010
Token the
Token ID262
Feature activation-3.94e-01

Pos logit contributions
lor+0.009
 lair+0.009
 mistake+0.009
oral+0.009
reetings+0.009
Neg logit contributions
cy-0.003
 fit-0.003
 lined-0.003
 went-0.002
 started-0.002
Token price
Token ID2756
Feature activation-5.77e-01

Pos logit contributions
 mistake+0.017
oral+0.017
 lair+0.016
lor+0.016
reetings+0.016
Neg logit contributions
 fit-0.007
 went-0.006
cy-0.006
 really-0.006
 started-0.006
Token They
Token ID1119
Feature activation-2.00e-01

Pos logit contributions
 mistake+0.024
 lair+0.022
lor+0.021
 property+0.021
oral+0.021
Neg logit contributions
 lined-0.008
cy-0.008
 went-0.008
 Derek-0.008
 Tim-0.008
Token find
Token ID1064
Feature activation-1.41e+00

Pos logit contributions
 mistake+0.009
 lair+0.008
lor+0.008
oral+0.008
 property+0.008
Neg logit contributions
 lined-0.004
 fit-0.004
aved-0.003
 went-0.003
cy-0.003
Token the
Token ID262
Feature activation-4.99e-01

Pos logit contributions
 mistake+0.060
lor+0.057
 lair+0.057
oral+0.056
reetings+0.052
Neg logit contributions
 fit-0.026
 lined-0.026
aved-0.025
cy-0.024
 went-0.024
Token rh
Token ID9529
Feature activation-2.56e+00

Pos logit contributions
lor+0.020
reetings+0.018
oral+0.018
 mistake+0.018
 lair+0.018
Neg logit contributions
 fit-0.012
 lined-0.010
 went-0.010
 really-0.010
 started-0.010
Tokenin
Token ID259
Feature activation-1.52e+00

Pos logit contributions
 mistake+0.115
 lair+0.109
 Majesty+0.106
lor+0.105
oral+0.101
Neg logit contributions
cy-0.043
 fit-0.037
 lined-0.036
aved-0.036
bo-0.034
Tokenoc
Token ID420
Feature activation-4.19e+00

Pos logit contributions
 mistake+0.042
 lair+0.040
lor+0.039
oral+0.037
 Majesty+0.037
Neg logit contributions
 fit-0.051
cy-0.050
 went-0.049
 lined-0.049
 started-0.047
Tokeneros
Token ID27498
Feature activation-6.50e-01

Pos logit contributions
 mistake+0.103
 lair+0.092
lor+0.090
oral+0.089
land+0.084
Neg logit contributions
cy-0.146
 lined-0.139
aved-0.137
 swapped-0.134
 fit-0.133
Token.
Token ID13
Feature activation+1.91e-01

Pos logit contributions
oral+0.028
lor+0.028
 mistake+0.027
 lair+0.027
 Majesty+0.026
Neg logit contributions
 went-0.013
 started-0.012
 fit-0.012
 lined-0.011
 got-0.011
Token They
Token ID1119
Feature activation+1.14e-02

Pos logit contributions
 lined+0.004
cy+0.004
 fit+0.004
aved+0.004
 swapped+0.003
Neg logit contributions
 mistake-0.008
 lair-0.007
lor-0.007
oral-0.007
 property-0.007
Token are
Token ID389
Feature activation-7.07e-01

Pos logit contributions
 lined+0.000
 fit+0.000
aved+0.000
cy+0.000
 went+0.000
Neg logit contributions
 mistake-0.000
lor-0.000
 lair-0.000
oral-0.000
 property-0.000
Token gentle
Token ID10296
Feature activation-1.54e+00

Pos logit contributions
 mistake+0.028
lor+0.027
 lair+0.027
oral+0.026
 property+0.025
Neg logit contributions
 fit-0.015
 lined-0.015
aved-0.014
cy-0.014
 swapped-0.014
Token turned
Token ID2900
Feature activation+5.22e-01

Pos logit contributions
 fit+0.007
 lined+0.006
 went+0.006
aved+0.006
cy+0.006
Neg logit contributions
 mistake-0.020
 lair-0.020
lor-0.019
oral-0.019
 property-0.018
Token out
Token ID503
Feature activation-3.12e-01

Pos logit contributions
 lined+0.008
cy+0.008
aved+0.007
 fit+0.007
 swapped+0.007
Neg logit contributions
 mistake-0.024
 lair-0.022
lor-0.022
oral-0.021
 property-0.021
Token that
Token ID326
Feature activation-2.36e-01

Pos logit contributions
lor+0.014
 lair+0.013
 mistake+0.013
oral+0.013
 Majesty+0.012
Neg logit contributions
 fit-0.006
 went-0.006
 started-0.006
 really-0.006
 lined-0.005
Token the
Token ID262
Feature activation-3.41e-01

Pos logit contributions
lor+0.011
 lair+0.011
 mistake+0.011
oral+0.011
 Majesty+0.010
Neg logit contributions
 fit-0.004
 went-0.004
 started-0.003
 lined-0.003
 matched-0.003
Token av
Token ID1196
Feature activation-8.88e-01

Pos logit contributions
lor+0.013
 mistake+0.012
 lair+0.012
oral+0.011
reetings+0.011
Neg logit contributions
 fit-0.009
 went-0.008
 started-0.008
cy-0.008
 really-0.008
Tokenoc
Token ID420
Feature activation-4.17e+00

Pos logit contributions
 mistake+0.017
 lair+0.015
lor+0.014
oral+0.013
 property+0.012
Neg logit contributions
 lined-0.037
 fit-0.036
aved-0.036
cy-0.036
 swapped-0.036
Tokenados
Token ID22484
Feature activation+6.35e-01

Pos logit contributions
 mistake+0.151
 lair+0.144
lor+0.138
land+0.136
oral+0.135
Neg logit contributions
cy-0.093
bo-0.084
aved-0.083
 went-0.082
 lined-0.082
Token were
Token ID547
Feature activation+8.52e-01

Pos logit contributions
 fit+0.007
 went+0.007
 lined+0.007
 started+0.007
aved+0.006
Neg logit contributions
 mistake-0.032
 lair-0.032
lor-0.031
oral-0.031
 Majesty-0.030
Token bad
Token ID2089
Feature activation+7.92e-01

Pos logit contributions
 fit+0.019
 started+0.017
 went+0.017
 really+0.017
 lined+0.017
Neg logit contributions
lor-0.035
oral-0.034
 mistake-0.033
 lair-0.033
 Majesty-0.031
Token and
Token ID290
Feature activation+1.39e-01

Pos logit contributions
 went+0.018
 started+0.017
 fit+0.017
 lined+0.017
 really+0.016
Neg logit contributions
lor-0.031
oral-0.030
 mistake-0.029
 lair-0.029
reetings-0.027
Token made
Token ID925
Feature activation+9.37e-02

Pos logit contributions
 fit+0.003
 went+0.003
 started+0.003
 lined+0.002
 swapped+0.002
Neg logit contributions
lor-0.006
oral-0.006
 lair-0.006
 mistake-0.006
land-0.005
Token had
Token ID550
Feature activation-1.69e+00

Pos logit contributions
 mistake+0.015
 lair+0.014
oral+0.014
lor+0.014
 privacy+0.013
Neg logit contributions
cy-0.013
aved-0.012
 lined-0.012
inn-0.012
pool-0.012
Token discovered
Token ID5071
Feature activation-5.14e-01

Pos logit contributions
 mistake+0.064
lor+0.062
 lair+0.061
oral+0.059
reetings+0.057
Neg logit contributions
 lined-0.038
 fit-0.038
 went-0.038
 swapped-0.038
 started-0.036
Token the
Token ID262
Feature activation-7.59e-01

Pos logit contributions
 mistake+0.021
 lair+0.020
lor+0.020
oral+0.019
 property+0.018
Neg logit contributions
 fit-0.010
 lined-0.010
cy-0.010
aved-0.009
 went-0.009
Token rh
Token ID9529
Feature activation-2.75e+00

Pos logit contributions
lor+0.028
 lair+0.024
 mistake+0.024
reetings+0.023
ently+0.023
Neg logit contributions
 fit-0.020
 lined-0.018
 really-0.018
 went-0.018
cy-0.017
Tokenin
Token ID259
Feature activation-1.99e+00

Pos logit contributions
 mistake+0.124
 lair+0.119
 Majesty+0.116
lor+0.114
 privacy+0.110
Neg logit contributions
cy-0.044
aved-0.038
 fit-0.038
 lined-0.037
bo-0.034
Tokenoc
Token ID420
Feature activation-4.11e+00

Pos logit contributions
 mistake+0.060
lor+0.057
 lair+0.056
 Majesty+0.054
 yourselves+0.052
Neg logit contributions
 fit-0.059
 went-0.059
cy-0.059
 lined-0.058
 really-0.056
Tokeneros
Token ID27498
Feature activation-9.22e-01

Pos logit contributions
 mistake+0.100
 lair+0.093
lor+0.090
oral+0.085
 witches+0.083
Neg logit contributions
cy-0.142
 lined-0.136
aved-0.134
 fit-0.130
 swapped-0.129
Token,
Token ID11
Feature activation-8.91e-01

Pos logit contributions
oral+0.036
lor+0.035
 mistake+0.035
 lair+0.034
 Majesty+0.033
Neg logit contributions
 went-0.021
 fit-0.021
 started-0.019
 really-0.019
 lined-0.018
Token but
Token ID475
Feature activation-9.29e-01

Pos logit contributions
lor+0.038
oral+0.037
reetings+0.035
 Majesty+0.034
 lair+0.034
Neg logit contributions
 fit-0.020
 really-0.018
 went-0.018
 Tim-0.018
 lined-0.017
Token they
Token ID484
Feature activation-2.60e-01

Pos logit contributions
lor+0.046
 lair+0.045
 mistake+0.044
oral+0.043
reetings+0.041
Neg logit contributions
 fit-0.013
 lined-0.011
 went-0.011
 started-0.010
 really-0.010
Token had
Token ID550
Feature activation-1.32e+00

Pos logit contributions
 mistake+0.011
 lair+0.010
lor+0.010
oral+0.010
 property+0.010
Neg logit contributions
 lined-0.005
 fit-0.005
cy-0.005
aved-0.005
 swapped-0.004
TokenLook
Token ID8567
Feature activation-1.11e-01

Pos logit contributions
 lined+0.023
 fit+0.022
aved+0.022
cy+0.022
 swapped+0.021
Neg logit contributions
 mistake-0.035
 lair-0.033
lor-0.033
oral-0.032
 property-0.030
Token,
Token ID11
Feature activation+3.60e-01

Pos logit contributions
 mistake+0.004
 lair+0.004
lor+0.003
oral+0.003
 Majesty+0.003
Neg logit contributions
aved-0.003
 lined-0.003
 fit-0.003
 swapped-0.003
cy-0.003
Token James
Token ID3700
Feature activation-2.43e-02

Pos logit contributions
 fit+0.007
 lined+0.007
cy+0.006
 went+0.006
 swapped+0.006
Neg logit contributions
 mistake-0.015
 lair-0.015
lor-0.014
oral-0.014
 property-0.013
Token!
Token ID0
Feature activation-6.09e-01

Pos logit contributions
 mistake+0.001
 lair+0.001
oral+0.001
lor+0.001
 privacy+0.001
Neg logit contributions
cy-0.001
 fit-0.001
 lined-0.001
 went-0.001
 started-0.001
Token An
Token ID1052
Feature activation-5.32e-01

Pos logit contributions
 mistake+0.025
 lair+0.022
oral+0.021
lor+0.021
 privacy+0.021
Neg logit contributions
cy-0.014
 lined-0.013
 went-0.013
 fit-0.013
aved-0.013
Token aer
Token ID9551
Feature activation-4.07e+00

Pos logit contributions
 mistake+0.016
 lair+0.015
lor+0.015
oral+0.014
 property+0.014
Neg logit contributions
 lined-0.016
 fit-0.016
cy-0.015
aved-0.015
 went-0.015
Tokenopl
Token ID20106
Feature activation-2.62e+00

Pos logit contributions
 mistake+0.123
 lair+0.117
 privacy+0.106
 wounds+0.106
joy+0.105
Neg logit contributions
 lined-0.116
bo-0.114
aved-0.111
 really-0.108
whe-0.108
Tokenane
Token ID1531
Feature activation-9.53e-01

Pos logit contributions
 mistake+0.108
 lair+0.099
 Majesty+0.087
 property+0.087
 privacy+0.087
Neg logit contributions
aved-0.057
cy-0.057
 lined-0.056
 went-0.053
pool-0.050
Token!"
Token ID2474
Feature activation-1.71e-01

Pos logit contributions
 mistake+0.032
 lair+0.031
oral+0.031
joy+0.030
lor+0.029
Neg logit contributions
 went-0.030
 fit-0.030
 started-0.029
 lined-0.028
 got-0.027
Token said
Token ID531
Feature activation+8.54e-01

Pos logit contributions
 mistake+0.007
 lair+0.006
lor+0.006
oral+0.006
 privacy+0.006
Neg logit contributions
aved-0.004
cy-0.004
 lined-0.003
 fit-0.003
 swapped-0.003
Token Dad
Token ID17415
Feature activation+4.87e-01

Pos logit contributions
aved+0.023
cy+0.021
pool+0.021
ishy+0.021
attered+0.020
Neg logit contributions
 mistake-0.029
 lair-0.026
 reading-0.024
lor-0.024
 aid-0.023
Token It
Token ID632
Feature activation+6.79e-01

Pos logit contributions
 fit+0.005
 lined+0.005
 went+0.005
 Derek+0.005
 O+0.005
Neg logit contributions
 mistake-0.021
 lair-0.020
 property-0.019
lor-0.019
 privacy-0.018
Token was
Token ID373
Feature activation+5.29e-02

Pos logit contributions
aved+0.014
 lined+0.014
cy+0.014
 swapped+0.013
 fit+0.013
Neg logit contributions
 mistake-0.026
 lair-0.025
lor-0.024
oral-0.023
 property-0.022
Token the
Token ID262
Feature activation-1.39e+00

Pos logit contributions
 fit+0.001
 lined+0.001
 went+0.001
aved+0.001
cy+0.001
Neg logit contributions
 mistake-0.002
lor-0.002
 lair-0.002
oral-0.002
 property-0.002
Token rh
Token ID9529
Feature activation-2.56e+00

Pos logit contributions
lor+0.051
 mistake+0.050
oral+0.048
 lair+0.047
reetings+0.045
Neg logit contributions
 fit-0.033
 lined-0.031
 went-0.030
cy-0.029
 started-0.029
Tokenin
Token ID259
Feature activation-1.66e+00

Pos logit contributions
 mistake+0.118
 lair+0.112
lor+0.107
 Majesty+0.106
oral+0.105
Neg logit contributions
cy-0.041
 fit-0.039
aved-0.037
 lined-0.035
bo-0.033
Tokenoc
Token ID420
Feature activation-4.05e+00

Pos logit contributions
 mistake+0.061
 lair+0.059
lor+0.058
oral+0.058
 privacy+0.054
Neg logit contributions
 fit-0.042
 went-0.039
 lined-0.038
cy-0.038
 started-0.037
Tokeneros
Token ID27498
Feature activation-7.37e-01

Pos logit contributions
 mistake+0.087
 lair+0.078
lor+0.075
oral+0.073
 privacy+0.065
Neg logit contributions
cy-0.154
 lined-0.154
 fit-0.152
aved-0.152
 went-0.148
Token.
Token ID13
Feature activation+4.50e-01

Pos logit contributions
 mistake+0.029
oral+0.029
lor+0.029
 lair+0.028
reetings+0.027
Neg logit contributions
 fit-0.016
 went-0.015
 lined-0.014
cy-0.014
 started-0.014
Token It
Token ID632
Feature activation+6.05e-01

Pos logit contributions
 fit+0.006
 lined+0.006
 Derek+0.006
 went+0.006
 Lil+0.006
Neg logit contributions
 mistake-0.022
 lair-0.021
 property-0.020
lor-0.019
 privacy-0.019
Token lifted
Token ID13663
Feature activation-2.94e-01

Pos logit contributions
 lined+0.011
cy+0.011
aved+0.011
 fit+0.011
 swapped+0.010
Neg logit contributions
 mistake-0.025
 lair-0.024
lor-0.023
oral-0.023
 property-0.022
Token its
Token ID663
Feature activation+5.06e-01

Pos logit contributions
 mistake+0.011
 lair+0.010
lor+0.010
oral+0.009
 property+0.009
Neg logit contributions
 lined-0.007
aved-0.007
 fit-0.007
cy-0.007
 swapped-0.007
Tokeneros
Token ID27498
Feature activation-4.74e-01

Pos logit contributions
 mistake+0.103
oral+0.090
lor+0.088
 lair+0.085
reetings+0.084
Neg logit contributions
cy-0.113
 lined-0.110
aved-0.109
bo-0.105
 went-0.104
Token.
Token ID13
Feature activation-2.14e-01

Pos logit contributions
 mistake+0.023
oral+0.022
lor+0.022
 Majesty+0.022
reetings+0.021
Neg logit contributions
 went-0.008
 started-0.007
 really-0.006
 got-0.006
 lined-0.006
Token The
Token ID383
Feature activation-1.04e+00

Pos logit contributions
 mistake+0.010
 lair+0.010
lor+0.009
 property+0.009
oral+0.009
Neg logit contributions
 lined-0.003
 went-0.003
 fit-0.003
cy-0.003
 Tim-0.003
Token rh
Token ID9529
Feature activation-2.44e+00

Pos logit contributions
lor+0.042
 mistake+0.041
 lair+0.040
oral+0.038
reetings+0.037
Neg logit contributions
 fit-0.021
 went-0.020
 lined-0.019
 really-0.019
cy-0.018
Tokenin
Token ID259
Feature activation-1.33e+00

Pos logit contributions
 mistake+0.112
 lair+0.108
lor+0.106
 Majesty+0.103
land+0.103
Neg logit contributions
cy-0.036
 fit-0.030
 lined-0.029
bo-0.029
 really-0.028
Tokenoc
Token ID420
Feature activation-4.05e+00

Pos logit contributions
 mistake+0.039
 lair+0.037
lor+0.035
 witches+0.034
 privacy+0.033
Neg logit contributions
 went-0.042
cy-0.041
 fit-0.040
bo-0.040
 lined-0.040
Tokeneros
Token ID27498
Feature activation-5.18e-01

Pos logit contributions
 mistake+0.108
lor+0.096
 lair+0.095
oral+0.094
land+0.091
Neg logit contributions
cy-0.129
 lined-0.123
aved-0.120
bo-0.119
 went-0.119
Token was
Token ID373
Feature activation+9.39e-03

Pos logit contributions
 mistake+0.025
lor+0.024
 lair+0.024
oral+0.023
 conclusions+0.022
Neg logit contributions
 went-0.008
 started-0.008
 lined-0.008
 fit-0.007
 really-0.007
Token very
Token ID845
Feature activation-1.08e+00

Pos logit contributions
 fit+0.000
 lined+0.000
aved+0.000
cy+0.000
 went+0.000
Neg logit contributions
 mistake-0.000
 lair-0.000
lor-0.000
oral-0.000
 property-0.000
Token big
Token ID1263
Feature activation-7.33e-02

Pos logit contributions
 mistake+0.040
 lair+0.038
lor+0.037
oral+0.036
 property+0.034
Neg logit contributions
 lined-0.025
 fit-0.025
cy-0.025
aved-0.024
 swapped-0.024
Token and
Token ID290
Feature activation-4.59e-01

Pos logit contributions
lor+0.003
oral+0.003
reetings+0.003
 mistake+0.003
 conclusions+0.003
Neg logit contributions
 went-0.001
 really-0.001
 fit-0.001
 started-0.001
 lined-0.001
Token Did
Token ID7731
Feature activation-1.14e+00

Pos logit contributions
 mistake+0.028
 lair+0.026
lor+0.024
oral+0.023
 property+0.022
Neg logit contributions
cy-0.030
 lined-0.030
 fit-0.029
aved-0.029
 swapped-0.029
Token you
Token ID345
Feature activation+4.59e-01

Pos logit contributions
 mistake+0.032
 lair+0.031
lor+0.030
oral+0.029
 property+0.027
Neg logit contributions
 fit-0.037
 lined-0.036
cy-0.035
 went-0.035
 swapped-0.034
Token know
Token ID760
Feature activation-8.64e-01

Pos logit contributions
 fit+0.009
 lined+0.009
aved+0.008
 went+0.008
 really+0.008
Neg logit contributions
 mistake-0.019
 lair-0.019
lor-0.019
oral-0.018
joy-0.017
Token that
Token ID326
Feature activation-1.31e-01

Pos logit contributions
oral+0.039
lor+0.039
 lair+0.039
 mistake+0.036
joy+0.036
Neg logit contributions
 really-0.016
 fit-0.015
 started-0.015
 both-0.014
 Tim-0.014
Token av
Token ID1196
Feature activation-6.55e-01

Pos logit contributions
lor+0.005
oral+0.005
 lair+0.005
 mistake+0.005
joy+0.005
Neg logit contributions
 fit-0.003
 went-0.003
 started-0.003
 really-0.003
 matched-0.003
Tokenoc
Token ID420
Feature activation-4.05e+00

Pos logit contributions
 mistake+0.011
 lair+0.010
lor+0.009
oral+0.009
 property+0.008
Neg logit contributions
 lined-0.028
 fit-0.028
aved-0.028
cy-0.027
 swapped-0.027
Tokenados
Token ID22484
Feature activation+9.20e-01

Pos logit contributions
 mistake+0.148
 lair+0.142
lor+0.135
land+0.133
oral+0.130
Neg logit contributions
 lined-0.086
cy-0.086
aved-0.083
organ-0.081
 went-0.081
Token are
Token ID389
Feature activation+2.01e+00

Pos logit contributions
 fit+0.019
 lined+0.018
 went+0.017
cy+0.017
aved+0.017
Neg logit contributions
 mistake-0.038
 lair-0.037
lor-0.036
oral-0.035
joy-0.033
Token very
Token ID845
Feature activation-1.63e-01

Pos logit contributions
 fit+0.048
 really+0.046
 lined+0.043
 went+0.042
 started+0.041
Neg logit contributions
lor-0.080
 lair-0.075
 mistake-0.075
oral-0.071
joy-0.071
Token healthy
Token ID5448
Feature activation+5.74e-02

Pos logit contributions
 mistake+0.007
 lair+0.007
lor+0.007
oral+0.006
joy+0.006
Neg logit contributions
 fit-0.003
 lined-0.003
 really-0.003
 went-0.003
 started-0.003
Token for
Token ID329
Feature activation-1.69e-01

Pos logit contributions
 fit+0.001
 went+0.001
 really+0.001
 lined+0.001
cy+0.001
Neg logit contributions
lor-0.002
 mistake-0.002
oral-0.002
 lair-0.002
 Majesty-0.002
Token playing
Token ID2712
Feature activation+7.87e-02

Pos logit contributions
 lined+0.004
 went+0.004
 started+0.004
 fit+0.004
aved+0.004
Neg logit contributions
oral-0.009
lor-0.009
 mistake-0.009
 lair-0.008
 privacy-0.008
Token!
Token ID0
Feature activation-3.73e-01

Pos logit contributions
 fit+0.001
 lined+0.001
 went+0.001
 really+0.001
 started+0.001
Neg logit contributions
lor-0.004
oral-0.004
 mistake-0.004
 lair-0.004
 Majesty-0.003
Token The
Token ID383
Feature activation-3.97e-01

Pos logit contributions
 mistake+0.017
 lair+0.016
lor+0.015
 property+0.015
 privacy+0.015
Neg logit contributions
 lined-0.006
cy-0.006
 went-0.006
 swapped-0.006
 Derek-0.006
Token rh
Token ID9529
Feature activation-2.35e+00

Pos logit contributions
lor+0.014
 lair+0.013
 mistake+0.013
oral+0.012
reetings+0.012
Neg logit contributions
 fit-0.011
 lined-0.010
 went-0.010
cy-0.010
 started-0.010
Tokenin
Token ID259
Feature activation-1.34e+00

Pos logit contributions
 mistake+0.113
 lair+0.112
 Majesty+0.108
lor+0.107
brance+0.106
Neg logit contributions
cy-0.030
 fit-0.022
 lined-0.021
aved-0.020
bo-0.019
Tokenoc
Token ID420
Feature activation-4.02e+00

Pos logit contributions
 mistake+0.037
 lair+0.037
lor+0.035
 witches+0.034
 privacy+0.034
Neg logit contributions
 went-0.044
cy-0.044
 started-0.042
 fit-0.042
bo-0.042
Tokeneros
Token ID27498
Feature activation-6.40e-01

Pos logit contributions
 mistake+0.101
 lair+0.092
lor+0.090
oral+0.090
land+0.086
Neg logit contributions
cy-0.136
 lined-0.128
aved-0.128
 swapped-0.124
 went-0.124
Token was
Token ID373
Feature activation+1.98e-01

Pos logit contributions
 mistake+0.029
lor+0.029
 lair+0.029
oral+0.029
 privacy+0.027
Neg logit contributions
 went-0.011
 started-0.011
 lined-0.010
 fit-0.010
 really-0.009
Token very
Token ID845
Feature activation-8.63e-01

Pos logit contributions
 lined+0.003
 fit+0.003
aved+0.003
 swapped+0.003
 went+0.003
Neg logit contributions
 mistake-0.009
lor-0.008
oral-0.008
 lair-0.008
 privacy-0.008
Token worried
Token ID7960
Feature activation-3.70e-01

Pos logit contributions
 mistake+0.033
 lair+0.030
oral+0.029
lor+0.029
 property+0.028
Neg logit contributions
cy-0.019
aved-0.019
 lined-0.019
 fit-0.018
 swapped-0.018
Token,
Token ID11
Feature activation-1.00e+00

Pos logit contributions
lor+0.015
 lair+0.015
 mistake+0.015
oral+0.015
 Majesty+0.014
Neg logit contributions
 lined-0.007
 fit-0.007
 really-0.007
 went-0.007
 started-0.007
Tokens
Token ID82
Feature activation+1.74e+00

Pos logit contributions
 lined+0.000
cy+0.000
aved+0.000
 fit+0.000
 went+0.000
Neg logit contributions
 mistake-0.000
 lair-0.000
lor-0.000
 privacy-0.000
oral-0.000
Token wrong
Token ID2642
Feature activation+1.38e+00

Pos logit contributions
 lined+0.060
 fit+0.059
aved+0.058
cy+0.058
 went+0.057
Neg logit contributions
 mistake-0.046
 lair-0.043
lor-0.041
oral-0.039
 property-0.036
Token,
Token ID11
Feature activation+2.49e-01

Pos logit contributions
 fit+0.047
 lined+0.046
cy+0.045
 went+0.045
 Derek+0.044
Neg logit contributions
 lair-0.036
 mistake-0.035
lor-0.034
oral-0.033
 privacy-0.031
Token rh
Token ID9529
Feature activation-1.29e+00

Pos logit contributions
 fit+0.005
 lined+0.004
 went+0.004
 Derek+0.004
 really+0.004
Neg logit contributions
 lair-0.011
 mistake-0.011
lor-0.010
oral-0.010
 privacy-0.009
Tokenin
Token ID259
Feature activation+2.30e-01

Pos logit contributions
 mistake+0.059
 lair+0.059
lor+0.055
 Majesty+0.055
 privacy+0.054
Neg logit contributions
cy-0.016
 fit-0.015
bo-0.015
 lined-0.015
 really-0.015
Tokenoc
Token ID420
Feature activation-4.01e+00

Pos logit contributions
 fit+0.007
 went+0.007
 lined+0.007
cy+0.007
 really+0.007
Neg logit contributions
 lair-0.006
lor-0.006
 mistake-0.006
 privacy-0.006
oral-0.006
Tokeneros
Token ID27498
Feature activation+3.57e-01

Pos logit contributions
 mistake+0.108
 lair+0.101
oral+0.094
lor+0.093
 privacy+0.085
Neg logit contributions
cy-0.127
 lined-0.127
aved-0.123
 Derek-0.122
bo-0.119
Token?
Token ID30
Feature activation-4.62e-01

Pos logit contributions
 went+0.011
 started+0.011
 fit+0.011
 lined+0.010
 got+0.010
Neg logit contributions
 lair-0.011
lor-0.011
 mistake-0.011
oral-0.010
 privacy-0.010
Tokenâ
Token ID22940
Feature activation-5.93e-01

Pos logit contributions
 mistake+0.018
 lair+0.017
oral+0.015
lor+0.015
 property+0.015
Neg logit contributions
cy-0.011
 lined-0.011
 fit-0.010
aved-0.010
 Derek-0.010
Token€
Token ID26391
Feature activation-5.30e-01

Pos logit contributions
 mistake+0.025
 lair+0.024
lor+0.023
oral+0.023
 property+0.022
Neg logit contributions
 lined-0.010
aved-0.010
 fit-0.010
cy-0.010
 swapped-0.010
Token said
Token ID531
Feature activation-2.12e-01

Pos logit contributions
 mistake+0.021
 lair+0.020
oral+0.019
 privacy+0.019
 witches+0.018
Neg logit contributions
 lined-0.012
 went-0.011
cy-0.011
 fit-0.011
aved-0.011
Token They
Token ID1119
Feature activation-6.81e-01

Pos logit contributions
 mistake+0.017
 lair+0.016
lor+0.016
 property+0.016
oral+0.015
Neg logit contributions
cy-0.005
 lined-0.005
 went-0.005
 fit-0.005
 Derek-0.004
Token saw
Token ID2497
Feature activation-1.36e+00

Pos logit contributions
 mistake+0.029
 lair+0.027
lor+0.027
oral+0.026
 property+0.025
Neg logit contributions
cy-0.012
 lined-0.011
aved-0.011
 fit-0.011
 swapped-0.010
Token the
Token ID262
Feature activation-5.42e-01

Pos logit contributions
 mistake+0.057
 lair+0.054
lor+0.054
oral+0.052
 property+0.050
Neg logit contributions
 lined-0.025
 fit-0.025
cy-0.024
aved-0.024
 swapped-0.024
Token rh
Token ID9529
Feature activation-2.77e+00

Pos logit contributions
lor+0.021
 lair+0.019
reetings+0.019
oral+0.018
 mistake+0.018
Neg logit contributions
 fit-0.013
cy-0.012
 went-0.012
 really-0.011
 lined-0.011
Tokenin
Token ID259
Feature activation-1.48e+00

Pos logit contributions
 mistake+0.118
 lair+0.112
lor+0.111
 Majesty+0.108
 property+0.105
Neg logit contributions
cy-0.051
 fit-0.045
aved-0.044
 lined-0.044
bo-0.042
Tokenoc
Token ID420
Feature activation-3.97e+00

Pos logit contributions
 mistake+0.037
lor+0.037
 lair+0.035
oral+0.033
reetings+0.033
Neg logit contributions
 fit-0.051
cy-0.051
 went-0.050
 lined-0.050
 started-0.049
Tokeneros
Token ID27498
Feature activation-8.91e-01

Pos logit contributions
 mistake+0.095
lor+0.086
 lair+0.086
oral+0.084
land+0.079
Neg logit contributions
cy-0.142
 lined-0.133
aved-0.132
 swapped-0.129
 fit-0.129
Token.
Token ID13
Feature activation+5.89e-01

Pos logit contributions
lor+0.035
oral+0.033
 mistake+0.032
reetings+0.032
 Majesty+0.032
Neg logit contributions
 went-0.021
 fit-0.020
 started-0.020
 lined-0.019
 really-0.019
Token They
Token ID1119
Feature activation-2.55e-01

Pos logit contributions
 lined+0.011
 fit+0.011
cy+0.011
aved+0.011
 swapped+0.011
Neg logit contributions
 mistake-0.024
 lair-0.023
lor-0.022
oral-0.022
 property-0.021
Token thanked
Token ID26280
Feature activation-1.82e-01

Pos logit contributions
 mistake+0.012
 lair+0.011
lor+0.011
oral+0.011
 property+0.010
Neg logit contributions
 lined-0.004
 fit-0.004
 went-0.004
aved-0.004
 swapped-0.004
Token Mom
Token ID11254
Feature activation+4.19e-01

Pos logit contributions
 lair+0.009
lor+0.009
 mistake+0.009
reetings+0.009
 property+0.008
Neg logit contributions
 went-0.002
 fit-0.002
cy-0.002
 lined-0.002
 swapped-0.002
Token hurt
Token ID5938
Feature activation-4.67e-02

Pos logit contributions
 mistake+0.019
 lair+0.018
lor+0.018
oral+0.018
 property+0.017
Neg logit contributions
 fit-0.008
 lined-0.008
aved-0.008
cy-0.008
 swapped-0.008
Token.
Token ID13
Feature activation-4.82e-01

Pos logit contributions
oral+0.002
 mistake+0.002
lor+0.002
 lair+0.002
 Majesty+0.002
Neg logit contributions
 fit-0.001
 went-0.001
 really-0.001
cy-0.001
 lined-0.001
Token The
Token ID383
Feature activation-6.19e-01

Pos logit contributions
 mistake+0.022
 lair+0.020
lor+0.020
 property+0.019
oral+0.019
Neg logit contributions
 lined-0.008
cy-0.008
 fit-0.007
 Derek-0.007
 went-0.007
Token rh
Token ID9529
Feature activation-2.34e+00

Pos logit contributions
lor+0.023
 mistake+0.022
 lair+0.021
reetings+0.021
oral+0.020
Neg logit contributions
 fit-0.015
 went-0.015
 started-0.014
 really-0.013
cy-0.013
Tokenin
Token ID259
Feature activation-1.22e+00

Pos logit contributions
 mistake+0.106
 lair+0.102
lor+0.099
 Majesty+0.098
brance+0.097
Neg logit contributions
cy-0.035
 fit-0.030
 lined-0.029
bo-0.028
 went-0.027
Tokenoc
Token ID420
Feature activation-3.97e+00

Pos logit contributions
 mistake+0.033
lor+0.031
 lair+0.031
 yourselves+0.030
 witches+0.029
Neg logit contributions
 went-0.041
 fit-0.040
cy-0.040
 started-0.039
bo-0.039
Tokeneros
Token ID27498
Feature activation-5.81e-01

Pos logit contributions
 mistake+0.098
 lair+0.089
lor+0.085
oral+0.085
land+0.080
Neg logit contributions
cy-0.138
 lined-0.130
aved-0.129
bo-0.127
 fit-0.126
Token kept
Token ID4030
Feature activation-4.55e-01

Pos logit contributions
 mistake+0.026
lor+0.026
oral+0.025
 lair+0.025
reetings+0.025
Neg logit contributions
 went-0.010
 started-0.009
 fit-0.009
 lined-0.009
cy-0.008
Token sn
Token ID3013
Feature activation-9.76e-01

Pos logit contributions
 mistake+0.017
lor+0.016
 lair+0.016
oral+0.016
 property+0.015
Neg logit contributions
 fit-0.011
 lined-0.011
aved-0.010
 went-0.010
cy-0.010
Tokenorting
Token ID24707
Feature activation-3.75e-01

Pos logit contributions
 mistake+0.032
lor+0.031
 lair+0.030
oral+0.029
 property+0.028
Neg logit contributions
 lined-0.026
 fit-0.026
 swapped-0.025
cy-0.025
aved-0.025
Token and
Token ID290
Feature activation-1.40e-01

Pos logit contributions
 mistake+0.013
 lair+0.013
lor+0.012
oral+0.012
 property+0.011
Neg logit contributions
 lined-0.009
 fit-0.009
cy-0.009
aved-0.009
 went-0.009
Token net
Token ID2010
Feature activation-2.41e-01

Pos logit contributions
lor+0.022
 mistake+0.022
 lair+0.022
oral+0.021
reetings+0.021
Neg logit contributions
 fit-0.013
cy-0.011
 went-0.011
 lined-0.011
aved-0.011
Token on
Token ID319
Feature activation-2.41e-01

Pos logit contributions
 mistake+0.011
lor+0.011
 lair+0.011
oral+0.011
 Majesty+0.010
Neg logit contributions
 fit-0.004
 lined-0.003
cy-0.003
 went-0.003
aved-0.003
Token the
Token ID262
Feature activation-3.27e-01

Pos logit contributions
 mistake+0.011
 lair+0.011
lor+0.010
oral+0.010
 property+0.010
Neg logit contributions
 lined-0.004
 fit-0.004
cy-0.003
 went-0.003
aved-0.003
Token rh
Token ID9529
Feature activation-2.20e+00

Pos logit contributions
lor+0.011
reetings+0.011
oral+0.011
 mistake+0.010
 lair+0.010
Neg logit contributions
 fit-0.009
 went-0.008
 started-0.008
 really-0.008
ny-0.008
Tokenin
Token ID259
Feature activation-1.09e+00

Pos logit contributions
 mistake+0.094
 lair+0.088
 Majesty+0.085
lor+0.084
oral+0.084
Neg logit contributions
cy-0.039
aved-0.034
 fit-0.034
bo-0.034
 lined-0.034
Tokenoc
Token ID420
Feature activation-3.96e+00

Pos logit contributions
 mistake+0.023
 lair+0.021
lor+0.019
oral+0.019
 property+0.017
Neg logit contributions
 lined-0.042
cy-0.042
 fit-0.042
aved-0.042
 went-0.041
Tokeneros
Token ID27498
Feature activation-3.71e-01

Pos logit contributions
 mistake+0.092
 lair+0.083
lor+0.080
oral+0.080
 witches+0.076
Neg logit contributions
cy-0.141
 lined-0.134
aved-0.134
 swapped-0.131
bo-0.130
Token.
Token ID13
Feature activation+1.46e-01

Pos logit contributions
 mistake+0.015
oral+0.014
lor+0.014
 lair+0.014
reetings+0.014
Neg logit contributions
 fit-0.008
 went-0.008
 lined-0.008
cy-0.007
 started-0.007
Token They
Token ID1119
Feature activation-2.54e-01

Pos logit contributions
 lined+0.003
cy+0.003
 fit+0.003
aved+0.003
 swapped+0.003
Neg logit contributions
 mistake-0.006
 lair-0.006
lor-0.006
oral-0.006
 property-0.005
Token put
Token ID1234
Feature activation-7.41e-01

Pos logit contributions
 mistake+0.010
 lair+0.010
lor+0.010
oral+0.009
 property+0.009
Neg logit contributions
 lined-0.005
 fit-0.005
cy-0.005
aved-0.005
 swapped-0.005
Token the
Token ID262
Feature activation-5.13e-01

Pos logit contributions
 mistake+0.025
 lair+0.023
lor+0.022
oral+0.021
 witches+0.021
Neg logit contributions
 lined-0.018
aved-0.018
organ-0.018
 fit-0.018
 swapped-0.017
Token They
Token ID1119
Feature activation-2.54e-01

Pos logit contributions
 lined+0.003
cy+0.003
 fit+0.003
aved+0.003
 swapped+0.003
Neg logit contributions
 mistake-0.006
 lair-0.006
lor-0.006
oral-0.006
 property-0.005
Token put
Token ID1234
Feature activation-7.41e-01

Pos logit contributions
 mistake+0.010
 lair+0.010
lor+0.010
oral+0.009
 property+0.009
Neg logit contributions
 lined-0.005
 fit-0.005
cy-0.005
aved-0.005
 swapped-0.005
Token the
Token ID262
Feature activation-5.13e-01

Pos logit contributions
 mistake+0.025
 lair+0.023
lor+0.022
oral+0.021
 witches+0.021
Neg logit contributions
 lined-0.018
aved-0.018
organ-0.018
 fit-0.018
 swapped-0.017
Token rh
Token ID9529
Feature activation-2.14e+00

Pos logit contributions
lor+0.020
 mistake+0.019
reetings+0.019
 lair+0.019
oral+0.018
Neg logit contributions
 fit-0.012
cy-0.010
 went-0.010
 started-0.010
 really-0.010
Tokenin
Token ID259
Feature activation-1.12e+00

Pos logit contributions
 mistake+0.090
 lair+0.086
lor+0.082
oral+0.080
 Majesty+0.079
Neg logit contributions
cy-0.041
 fit-0.038
aved-0.037
 lined-0.037
 swapped-0.035
Tokenoc
Token ID420
Feature activation-3.95e+00

Pos logit contributions
 mistake+0.034
 lair+0.031
lor+0.030
oral+0.030
 Majesty+0.028
Neg logit contributions
cy-0.035
 fit-0.034
 went-0.034
aved-0.034
 lined-0.034
Tokeneros
Token ID27498
Feature activation-9.70e-03

Pos logit contributions
 mistake+0.090
 lair+0.081
lor+0.078
oral+0.076
 witches+0.070
Neg logit contributions
cy-0.145
 lined-0.140
aved-0.140
 fit-0.137
 swapped-0.136
Token in
Token ID287
Feature activation-5.15e-01

Pos logit contributions
 mistake+0.000
oral+0.000
lor+0.000
 lair+0.000
reetings+0.000
Neg logit contributions
cy-0.000
 fit-0.000
 went-0.000
aved-0.000
 lined-0.000
Token the
Token ID262
Feature activation-1.20e+00

Pos logit contributions
 mistake+0.023
 lair+0.023
lor+0.023
oral+0.022
reetings+0.021
Neg logit contributions
 fit-0.008
 lined-0.007
cy-0.007
 went-0.007
aved-0.007
Token truck
Token ID7779
Feature activation-3.26e-01

Pos logit contributions
 mistake+0.044
lor+0.043
 lair+0.041
reetings+0.041
oral+0.040
Neg logit contributions
 went-0.028
 fit-0.028
cy-0.027
 swapped-0.026
 started-0.026
Token.
Token ID13
Feature activation+8.70e-02

Pos logit contributions
 mistake+0.013
oral+0.013
lor+0.012
 lair+0.012
reetings+0.012
Neg logit contributions
 lined-0.007
 fit-0.006
 went-0.006
cy-0.006
aved-0.006
Token loud
Token ID7812
Feature activation+1.05e-01

Pos logit contributions
 mistake+0.008
 lair+0.008
lor+0.007
oral+0.007
 property+0.007
Neg logit contributions
 lined-0.004
 fit-0.004
cy-0.004
aved-0.004
 swapped-0.004
Token.
Token ID13
Feature activation-5.36e-01

Pos logit contributions
 lined+0.002
 went+0.002
 fit+0.002
 started+0.002
 really+0.002
Neg logit contributions
oral-0.005
 mistake-0.005
lor-0.005
 lair-0.004
reetings-0.004
Token The
Token ID383
Feature activation-1.32e+00

Pos logit contributions
 mistake+0.027
 lair+0.025
lor+0.024
 property+0.024
oral+0.024
Neg logit contributions
 lined-0.007
 Tim-0.007
 went-0.007
 Bobby-0.007
 Derek-0.007
Token rh
Token ID9529
Feature activation-2.53e+00

Pos logit contributions
lor+0.053
 lair+0.048
 mistake+0.047
reetings+0.044
oral+0.043
Neg logit contributions
 went-0.033
 fit-0.032
 started-0.031
 really-0.029
 lined-0.029
Tokenin
Token ID259
Feature activation-1.57e+00

Pos logit contributions
 mistake+0.116
 lair+0.113
lor+0.110
 Majesty+0.109
brance+0.108
Neg logit contributions
cy-0.038
 fit-0.030
 lined-0.030
aved-0.030
bo-0.029
Tokenoc
Token ID420
Feature activation-3.95e+00

Pos logit contributions
 mistake+0.041
 lair+0.039
lor+0.038
oral+0.036
 witches+0.035
Neg logit contributions
 went-0.054
cy-0.053
 fit-0.053
 lined-0.053
 started-0.052
Tokeneros
Token ID27498
Feature activation-6.17e-01

Pos logit contributions
 mistake+0.101
 lair+0.090
lor+0.089
oral+0.089
land+0.084
Neg logit contributions
cy-0.131
 lined-0.125
aved-0.124
bo-0.120
 went-0.120
Token was
Token ID373
Feature activation-2.50e-02

Pos logit contributions
 mistake+0.029
lor+0.029
 lair+0.028
oral+0.028
reetings+0.026
Neg logit contributions
 went-0.010
 started-0.010
 lined-0.010
 fit-0.009
 got-0.008
Token so
Token ID523
Feature activation-5.38e-01

Pos logit contributions
 mistake+0.001
lor+0.001
oral+0.001
 lair+0.001
 privacy+0.001
Neg logit contributions
 lined-0.000
 fit-0.000
 went-0.000
 started-0.000
aved-0.000
Token scared
Token ID12008
Feature activation-6.13e-01

Pos logit contributions
 mistake+0.022
 lair+0.021
lor+0.021
oral+0.021
 property+0.020
Neg logit contributions
 lined-0.010
 fit-0.010
aved-0.010
cy-0.010
 went-0.010
Token.
Token ID13
Feature activation-7.36e-01

Pos logit contributions
 mistake+0.026
 lair+0.025
oral+0.024
lor+0.024
 Majesty+0.022
Neg logit contributions
 lined-0.012
 fit-0.012
 went-0.012
 started-0.011
cy-0.011
Token She
Token ID1375
Feature activation+4.40e-02

Pos logit contributions
 lined+0.002
cy+0.002
 went+0.002
 fit+0.002
 swapped+0.002
Neg logit contributions
 mistake-0.005
 lair-0.004
lor-0.004
oral-0.004
 property-0.004
Token saw
Token ID2497
Feature activation-1.17e+00

Pos logit contributions
 went+0.001
 fit+0.001
 lined+0.001
 started+0.001
cy+0.001
Neg logit contributions
 mistake-0.002
lor-0.002
 lair-0.002
 property-0.002
oral-0.002
Token the
Token ID262
Feature activation-1.05e+00

Pos logit contributions
 mistake+0.047
 lair+0.045
lor+0.044
oral+0.043
 property+0.041
Neg logit contributions
 lined-0.024
 fit-0.023
cy-0.023
aved-0.023
 went-0.022
Token rh
Token ID9529
Feature activation-2.64e+00

Pos logit contributions
lor+0.038
 mistake+0.038
 lair+0.037
oral+0.035
reetings+0.034
Neg logit contributions
 fit-0.026
 lined-0.024
 went-0.024
cy-0.024
 swapped-0.023
Tokenin
Token ID259
Feature activation-1.66e+00

Pos logit contributions
 mistake+0.112
 lair+0.107
lor+0.102
 Majesty+0.100
 property+0.099
Neg logit contributions
cy-0.049
 fit-0.046
 lined-0.045
aved-0.044
 swapped-0.042
Tokenoc
Token ID420
Feature activation-3.94e+00

Pos logit contributions
 mistake+0.045
 lair+0.043
lor+0.041
oral+0.037
 Majesty+0.037
Neg logit contributions
cy-0.057
 fit-0.056
 went-0.056
 lined-0.055
 started-0.054
Tokeneros
Token ID27498
Feature activation-4.94e-01

Pos logit contributions
 mistake+0.088
 lair+0.080
lor+0.076
oral+0.074
 witches+0.066
Neg logit contributions
cy-0.147
 lined-0.143
aved-0.142
 fit-0.141
 swapped-0.140
Token and
Token ID290
Feature activation-2.53e-01

Pos logit contributions
lor+0.019
 mistake+0.018
oral+0.018
 lair+0.018
reetings+0.017
Neg logit contributions
 went-0.012
 fit-0.012
 started-0.012
 really-0.011
 lined-0.011
Token knew
Token ID2993
Feature activation-2.97e-01

Pos logit contributions
lor+0.013
 mistake+0.012
 lair+0.012
oral+0.012
reetings+0.012
Neg logit contributions
 went-0.004
 started-0.003
 lined-0.003
 fit-0.003
 swapped-0.003
Token he
Token ID339
Feature activation-6.10e-01

Pos logit contributions
 mistake+0.014
 lair+0.014
oral+0.014
lor+0.014
reetings+0.013
Neg logit contributions
 went-0.004
 fit-0.004
 started-0.004
 really-0.004
 Tim-0.004
Token was
Token ID373
Feature activation+1.27e+00

Pos logit contributions
 mistake+0.025
 lair+0.024
lor+0.023
oral+0.023
 property+0.022
Neg logit contributions
 lined-0.011
aved-0.011
cy-0.011
 fit-0.011
 swapped-0.011
Token care
Token ID1337
Feature activation-1.25e-01

Pos logit contributions
 mistake+0.002
 lair+0.001
 witches+0.001
 privacy+0.001
lor+0.001
Neg logit contributions
 lined-0.001
cy-0.001
aved-0.001
 swapped-0.001
 fit-0.001
Token of
Token ID286
Feature activation+2.69e-01

Pos logit contributions
 mistake+0.005
 lair+0.005
lor+0.005
oral+0.005
 Majesty+0.005
Neg logit contributions
 fit-0.003
aved-0.002
 really-0.002
 lined-0.002
cy-0.002
Token the
Token ID262
Feature activation-8.57e-01

Pos logit contributions
 fit+0.004
 really+0.003
 went+0.003
 Bobby+0.003
 Derek+0.003
Neg logit contributions
oral-0.013
 lair-0.013
lor-0.013
reetings-0.012
 mistake-0.012
Token rh
Token ID9529
Feature activation-2.15e+00

Pos logit contributions
lor+0.035
oral+0.031
reetings+0.031
 lair+0.030
 mistake+0.029
Neg logit contributions
 fit-0.020
 went-0.018
 really-0.018
cy-0.017
 started-0.016
Tokenin
Token ID259
Feature activation-1.45e+00

Pos logit contributions
 mistake+0.094
 lair+0.090
lor+0.087
 Majesty+0.085
 property+0.085
Neg logit contributions
cy-0.036
 fit-0.031
aved-0.031
bo-0.030
 lined-0.030
Tokenoc
Token ID420
Feature activation-3.94e+00

Pos logit contributions
 mistake+0.041
lor+0.039
 lair+0.039
oral+0.037
reetings+0.036
Neg logit contributions
cy-0.047
 fit-0.045
 went-0.044
 lined-0.043
ny-0.043
Tokeneros
Token ID27498
Feature activation-4.88e-01

Pos logit contributions
 mistake+0.090
 lair+0.082
lor+0.079
oral+0.077
reetings+0.071
Neg logit contributions
cy-0.143
 lined-0.137
aved-0.137
 swapped-0.134
 fit-0.133
Token and
Token ID290
Feature activation-4.00e-01

Pos logit contributions
oral+0.020
lor+0.019
 mistake+0.019
 lair+0.019
reetings+0.019
Neg logit contributions
 fit-0.010
cy-0.009
 went-0.009
 really-0.009
 started-0.009
Token it
Token ID340
Feature activation+1.81e-01

Pos logit contributions
lor+0.018
oral+0.017
 mistake+0.017
 lair+0.017
reetings+0.016
Neg logit contributions
 went-0.008
 started-0.008
 fit-0.008
 lined-0.007
 swapped-0.006
Token needed
Token ID2622
Feature activation-6.32e-01

Pos logit contributions
 lined+0.004
cy+0.004
aved+0.004
 swapped+0.003
 fit+0.003
Neg logit contributions
 mistake-0.007
 lair-0.007
lor-0.006
oral-0.006
 property-0.006
Token to
Token ID284
Feature activation-3.71e-02

Pos logit contributions
lor+0.027
 lair+0.027
 mistake+0.027
oral+0.026
 Majesty+0.026
Neg logit contributions
 fit-0.012
 went-0.012
 really-0.011
 started-0.011
 lined-0.010
Token try
Token ID1949
Feature activation-6.67e-01

Pos logit contributions
aved+0.013
organ+0.012
 lined+0.012
cy+0.012
inn+0.012
Neg logit contributions
 mistake-0.020
 lair-0.019
 witches-0.017
lor-0.017
 reading-0.017
Token to
Token ID284
Feature activation+3.34e-03

Pos logit contributions
lor+0.030
oral+0.030
 lair+0.029
 mistake+0.028
 privacy+0.028
Neg logit contributions
 really-0.011
 fit-0.011
 lined-0.010
 went-0.010
 started-0.009
Token grab
Token ID5552
Feature activation-4.29e-01

Pos logit contributions
attered+0.000
aved+0.000
 lined+0.000
cy+0.000
 swapped+0.000
Neg logit contributions
 mistake-0.000
 lair-0.000
 property-0.000
 conclusions-0.000
lor-0.000
Token the
Token ID262
Feature activation+6.64e-02

Pos logit contributions
 mistake+0.014
 conclusions+0.011
 yourselves+0.011
 lair+0.011
 property+0.011
Neg logit contributions
aved-0.012
organ-0.012
 lined-0.012
 swapped-0.011
attered-0.011
Token av
Token ID1196
Feature activation-7.71e-01

Pos logit contributions
 fit+0.001
cy+0.001
 went+0.001
aved+0.001
 lined+0.001
Neg logit contributions
lor-0.003
 mistake-0.003
 lair-0.003
oral-0.002
land-0.002
Tokenoc
Token ID420
Feature activation-3.93e+00

Pos logit contributions
 mistake+0.011
 lair+0.010
lor+0.010
oral+0.009
 property+0.008
Neg logit contributions
 lined-0.035
aved-0.034
 fit-0.034
 swapped-0.034
 Derek-0.033
Tokenados
Token ID22484
Feature activation+2.98e-01

Pos logit contributions
 mistake+0.133
 lair+0.126
lor+0.120
oral+0.117
land+0.114
Neg logit contributions
cy-0.099
 lined-0.096
aved-0.095
 fit-0.093
 swapped-0.093
Token.
Token ID13
Feature activation+4.43e-01

Pos logit contributions
 lined+0.007
aved+0.007
cy+0.007
 fit+0.007
 swapped+0.007
Neg logit contributions
 mistake-0.011
 lair-0.010
lor-0.010
oral-0.010
 property-0.009
Token But
Token ID887
Feature activation-6.41e-01

Pos logit contributions
 lined+0.008
 fit+0.008
cy+0.008
aved+0.008
 swapped+0.008
Neg logit contributions
 mistake-0.019
 lair-0.018
lor-0.017
oral-0.017
 property-0.016
Token the
Token ID262
Feature activation-8.96e-01

Pos logit contributions
 mistake+0.026
 lair+0.025
lor+0.024
oral+0.023
 property+0.023
Neg logit contributions
aved-0.013
 lined-0.012
cy-0.012
 fit-0.012
 swapped-0.012
Token rock
Token ID3881
Feature activation+3.95e-01

Pos logit contributions
lor+0.033
 mistake+0.032
 lair+0.031
land+0.030
oral+0.030
Neg logit contributions
 fit-0.021
 went-0.021
 started-0.020
 lined-0.020
 swapped-0.019
TokenIt
Token ID1026
Feature activation+1.52e-01

Pos logit contributions
 mistake+0.023
 lair+0.021
lor+0.021
oral+0.021
 property+0.019
Neg logit contributions
 lined-0.021
aved-0.020
 fit-0.020
cy-0.020
 swapped-0.020
Token is
Token ID318
Feature activation+1.45e-01

Pos logit contributions
 lined+0.005
 fit+0.005
cy+0.005
aved+0.005
 swapped+0.004
Neg logit contributions
 mistake-0.005
 lair-0.004
lor-0.004
oral-0.004
 property-0.004
Token the
Token ID262
Feature activation-1.44e+00

Pos logit contributions
 fit+0.002
 lined+0.002
 went+0.002
cy+0.002
aved+0.002
Neg logit contributions
 mistake-0.006
lor-0.006
 lair-0.006
oral-0.006
 property-0.006
Token rh
Token ID9529
Feature activation-2.39e+00

Pos logit contributions
 mistake+0.050
lor+0.049
 lair+0.047
oral+0.047
 witches+0.044
Neg logit contributions
 fit-0.037
 lined-0.034
 went-0.033
cy-0.032
 really-0.032
Tokenin
Token ID259
Feature activation-1.70e+00

Pos logit contributions
 mistake+0.109
 lair+0.105
 Majesty+0.100
lor+0.099
oral+0.097
Neg logit contributions
cy-0.038
 fit-0.037
aved-0.034
 lined-0.032
bo-0.032
Tokenoc
Token ID420
Feature activation-3.93e+00

Pos logit contributions
 mistake+0.048
 lair+0.047
oral+0.045
lor+0.045
joy+0.042
Neg logit contributions
 fit-0.057
 went-0.054
cy-0.053
 lined-0.052
 really-0.052
Tokeneros
Token ID27498
Feature activation-6.62e-01

Pos logit contributions
 mistake+0.081
 lair+0.073
lor+0.069
oral+0.067
 property+0.060
Neg logit contributions
 lined-0.153
cy-0.153
 fit-0.152
aved-0.151
 swapped-0.147
Token!"
Token ID2474
Feature activation+2.98e-01

Pos logit contributions
oral+0.028
 mistake+0.027
 lair+0.027
lor+0.026
joy+0.025
Neg logit contributions
 fit-0.015
 went-0.013
 really-0.013
cy-0.012
 started-0.012
Token\n
Token ID198
Feature activation+4.41e-01

Pos logit contributions
 went+0.004
 really+0.004
 lined+0.004
 fit+0.003
 started+0.003
Neg logit contributions
 mistake-0.015
 lair-0.015
lor-0.014
 property-0.014
 Majesty-0.013
Token\n
Token ID198
Feature activation+9.18e-01

Pos logit contributions
 went+0.007
 fit+0.007
 lined+0.007
 really+0.006
cy+0.006
Neg logit contributions
 mistake-0.019
 lair-0.019
lor-0.018
 property-0.018
 witches-0.018
TokenThe
Token ID464
Feature activation-5.17e-01

Pos logit contributions
 went+0.022
 fit+0.022
 lined+0.022
cy+0.021
 swapped+0.021
Neg logit contributions
 mistake-0.034
 lair-0.034
lor-0.032
 witches-0.030
oral-0.030
Token it
Token ID340
Feature activation+1.40e+00

Pos logit contributions
aved+0.037
cy+0.037
 lined+0.036
 fit+0.035
 swapped+0.035
Neg logit contributions
 mistake-0.054
 lair-0.048
lor-0.047
oral-0.045
 property-0.044
Token,
Token ID11
Feature activation-1.03e-01

Pos logit contributions
 fit+0.029
 lined+0.029
cy+0.029
aved+0.028
 went+0.028
Neg logit contributions
 mistake-0.055
 lair-0.052
lor-0.051
oral-0.050
 property-0.048
Token the
Token ID262
Feature activation-1.06e+00

Pos logit contributions
 mistake+0.005
 lair+0.005
lor+0.005
oral+0.005
 property+0.004
Neg logit contributions
 fit-0.001
 lined-0.001
 went-0.001
cy-0.001
 Derek-0.001
Token rh
Token ID9529
Feature activation-2.25e+00

Pos logit contributions
lor+0.041
 mistake+0.040
 lair+0.039
oral+0.038
reetings+0.037
Neg logit contributions
 fit-0.023
 went-0.021
 lined-0.020
 really-0.020
cy-0.020
Tokenin
Token ID259
Feature activation-1.29e+00

Pos logit contributions
 mistake+0.102
 lair+0.098
lor+0.095
 Majesty+0.093
 property+0.093
Neg logit contributions
cy-0.034
 fit-0.030
 lined-0.029
bo-0.029
aved-0.029
Tokenoc
Token ID420
Feature activation-3.92e+00

Pos logit contributions
 mistake+0.034
 lair+0.032
lor+0.030
 witches+0.029
 privacy+0.029
Neg logit contributions
 went-0.044
 fit-0.044
 lined-0.043
 started-0.043
cy-0.042
Tokeneros
Token ID27498
Feature activation+1.20e-02

Pos logit contributions
 mistake+0.096
 lair+0.087
lor+0.084
oral+0.083
land+0.078
Neg logit contributions
cy-0.137
 lined-0.128
aved-0.128
bo-0.126
 swapped-0.125
Token smiled
Token ID13541
Feature activation+4.31e-01

Pos logit contributions
 went+0.000
 started+0.000
 fit+0.000
 lined+0.000
 swapped+0.000
Neg logit contributions
 mistake-0.001
lor-0.001
 lair-0.001
oral-0.001
 property-0.001
Token.
Token ID13
Feature activation-2.86e-01

Pos logit contributions
 fit+0.011
 lined+0.011
 went+0.011
cy+0.011
aved+0.011
Neg logit contributions
 mistake-0.015
 lair-0.014
lor-0.014
oral-0.013
 property-0.013
Token He
Token ID679
Feature activation-3.80e-01

Pos logit contributions
 mistake+0.013
 lair+0.012
 property+0.011
lor+0.011
 privacy+0.011
Neg logit contributions
 lined-0.005
cy-0.005
 went-0.005
 Derek-0.005
 fit-0.005
Token had
Token ID550
Feature activation-1.14e+00

Pos logit contributions
 mistake+0.017
 lair+0.017
lor+0.017
oral+0.016
 property+0.016
Neg logit contributions
 went-0.006
 fit-0.006
 lined-0.006
 started-0.006
 swapped-0.006
Token They
Token ID1119
Feature activation+2.93e-01

Pos logit contributions
 lined+0.019
 fit+0.018
cy+0.018
aved+0.018
 swapped+0.017
Neg logit contributions
 mistake-0.042
 lair-0.040
lor-0.039
oral-0.038
 property-0.037
Token heard
Token ID2982
Feature activation-1.29e+00

Pos logit contributions
 lined+0.006
 fit+0.006
 went+0.006
aved+0.006
 started+0.006
Neg logit contributions
 mistake-0.012
 lair-0.011
lor-0.011
oral-0.011
 property-0.010
Token the
Token ID262
Feature activation-6.27e-01

Pos logit contributions
 mistake+0.058
lor+0.057
 lair+0.057
oral+0.055
reetings+0.054
Neg logit contributions
 lined-0.019
 went-0.019
 fit-0.019
 started-0.017
 swapped-0.017
Token rh
Token ID9529
Feature activation-2.43e+00

Pos logit contributions
lor+0.023
 lair+0.021
reetings+0.021
 mistake+0.021
oral+0.020
Neg logit contributions
 fit-0.015
 went-0.015
 started-0.014
 lined-0.014
 really-0.014
Tokenin
Token ID259
Feature activation-1.47e+00

Pos logit contributions
 mistake+0.110
 lair+0.109
 Majesty+0.106
lor+0.102
 privacy+0.101
Neg logit contributions
cy-0.037
 fit-0.030
aved-0.029
bo-0.029
 lined-0.029
Tokenoc
Token ID420
Feature activation-3.92e+00

Pos logit contributions
 mistake+0.036
 lair+0.035
lor+0.034
oral+0.031
 Majesty+0.030
Neg logit contributions
 went-0.051
 lined-0.051
 fit-0.051
cy-0.050
 started-0.050
Tokeneros
Token ID27498
Feature activation-7.42e-01

Pos logit contributions
 mistake+0.084
 lair+0.078
lor+0.074
oral+0.072
 witches+0.065
Neg logit contributions
cy-0.147
 lined-0.145
aved-0.142
 fit-0.142
 swapped-0.140
Token sn
Token ID3013
Feature activation-6.21e-01

Pos logit contributions
lor+0.030
oral+0.029
 mistake+0.028
 lair+0.028
reetings+0.027
Neg logit contributions
 fit-0.015
 went-0.015
 lined-0.014
 started-0.014
 really-0.014
Tokenore
Token ID382
Feature activation-2.20e+00

Pos logit contributions
lor+0.019
 mistake+0.018
oral+0.017
 lair+0.017
 property+0.016
Neg logit contributions
 fit-0.019
 lined-0.018
 matched-0.018
 swapped-0.017
organ-0.017
Token.
Token ID13
Feature activation+9.50e-01

Pos logit contributions
 lair+0.088
 mistake+0.088
lor+0.086
oral+0.085
 Majesty+0.082
Neg logit contributions
 fit-0.042
 lined-0.040
 went-0.039
cy-0.038
 really-0.037
Token\n
Token ID198
Feature activation-5.35e-01

Pos logit contributions
 lined+0.019
 fit+0.019
aved+0.019
cy+0.019
 swapped+0.018
Neg logit contributions
 mistake-0.038
 lair-0.036
lor-0.035
oral-0.034
 property-0.033
Token.
Token ID13
Feature activation-2.12e-01

Pos logit contributions
 lined+0.032
 fit+0.032
 went+0.031
 started+0.030
cy+0.029
Neg logit contributions
oral-0.069
 mistake-0.069
 lair-0.066
lor-0.066
reetings-0.062
Token␣
Token ID220
Feature activation+2.62e-01

Pos logit contributions
 mistake+0.010
 lair+0.010
 property+0.009
lor+0.009
oral+0.009
Neg logit contributions
 Bobby-0.003
 went-0.003
 lined-0.003
cy-0.003
 Tim-0.003
Token\n
Token ID198
Feature activation-7.58e-02

Pos logit contributions
 lined+0.005
aved+0.005
 went+0.004
cy+0.004
 fit+0.004
Neg logit contributions
 mistake-0.011
 lair-0.011
lor-0.010
 privacy-0.010
 property-0.010
Token\n
Token ID198
Feature activation-3.45e-02

Pos logit contributions
 mistake+0.003
 lair+0.003
 property+0.003
 privacy+0.003
 witches+0.003
Neg logit contributions
 lined-0.001
inn-0.001
 really-0.001
 fit-0.001
cy-0.001
TokenOne
Token ID3198
Feature activation-3.16e-01

Pos logit contributions
 mistake+0.001
 lair+0.001
 witches+0.001
lor+0.001
 privacy+0.001
Neg logit contributions
 lined-0.001
 swapped-0.001
inn-0.001
 fit-0.001
 really-0.001
Token day
Token ID1110
Feature activation+2.10e+00

Pos logit contributions
 mistake+0.016
lor+0.016
 lair+0.015
oral+0.014
 property+0.014
Neg logit contributions
 fit-0.004
cy-0.004
 went-0.003
 lined-0.003
 started-0.003
Token,
Token ID11
Feature activation+1.07e+00

Pos logit contributions
 lined+0.046
aved+0.046
cy+0.046
 fit+0.045
 swapped+0.044
Neg logit contributions
 mistake-0.078
 lair-0.073
lor-0.072
oral-0.070
 witches-0.067
Token a
Token ID257
Feature activation+7.01e-01

Pos logit contributions
 lined+0.018
 fit+0.018
cy+0.018
aved+0.018
 swapped+0.017
Neg logit contributions
 mistake-0.046
 lair-0.044
lor-0.043
oral-0.042
 property-0.041
Token little
Token ID1310
Feature activation+1.03e+00

Pos logit contributions
 lined+0.018
cy+0.018
aved+0.018
 fit+0.018
 swapped+0.017
Neg logit contributions
 mistake-0.024
 lair-0.023
lor-0.022
oral-0.021
 property-0.020
Token girl
Token ID2576
Feature activation+8.58e-01

Pos logit contributions
 fit+0.025
cy+0.025
 lined+0.025
aved+0.024
 went+0.024
Neg logit contributions
 mistake-0.037
 lair-0.035
lor-0.035
oral-0.034
 property-0.032
Token saw
Token ID2497
Feature activation-5.96e-01

Pos logit contributions
 went+0.016
 started+0.014
cy+0.013
 lined+0.013
 fit+0.013
Neg logit contributions
 mistake-0.041
 lair-0.039
lor-0.039
oral-0.039
 property-0.037
Token could
Token ID714
Feature activation+9.29e-01

Pos logit contributions
 lined+0.010
 fit+0.010
cy+0.010
aved+0.010
 swapped+0.010
Neg logit contributions
 mistake-0.031
 lair-0.030
lor-0.029
oral-0.029
 property-0.028
Token.
Token ID13
Feature activation-7.53e-02

Pos logit contributions
 fit+0.017
 went+0.017
 lined+0.015
 started+0.015
aved+0.015
Neg logit contributions
 mistake-0.042
oral-0.041
lor-0.040
 lair-0.039
 witches-0.039
Token␣
Token ID220
Feature activation+1.18e-01

Pos logit contributions
 mistake+0.003
 lair+0.003
lor+0.003
oral+0.003
 property+0.003
Neg logit contributions
 lined-0.002
cy-0.002
 fit-0.002
aved-0.002
 swapped-0.002
Token\n
Token ID198
Feature activation-6.02e-01

Pos logit contributions
 lined+0.002
cy+0.002
aved+0.002
 fit+0.002
 went+0.002
Neg logit contributions
 mistake-0.005
 lair-0.005
 property-0.004
lor-0.004
 privacy-0.004
Token\n
Token ID198
Feature activation-2.71e-01

Pos logit contributions
 mistake+0.025
 lair+0.024
 property+0.023
 privacy+0.023
 witches+0.023
Neg logit contributions
 lined-0.012
 fit-0.011
 really-0.011
inn-0.011
 went-0.011
TokenWhen
Token ID2215
Feature activation+2.07e+00

Pos logit contributions
 mistake+0.011
 lair+0.011
 privacy+0.010
 property+0.010
 witches+0.010
Neg logit contributions
 lined-0.006
 fit-0.006
 really-0.006
inn-0.005
 went-0.005
Token night
Token ID1755
Feature activation+2.28e+00

Pos logit contributions
 lined+0.036
 fit+0.036
cy+0.036
aved+0.035
 swapped+0.034
Neg logit contributions
 mistake-0.089
 lair-0.084
lor-0.082
oral-0.081
 property-0.078
Token came
Token ID1625
Feature activation+1.33e+00

Pos logit contributions
aved+0.061
cy+0.061
idget+0.060
organ+0.059
 swapped+0.057
Neg logit contributions
 mistake-0.067
lor-0.063
oral-0.063
land-0.060
 witch-0.057
Token,
Token ID11
Feature activation+3.80e-01

Pos logit contributions
cy+0.026
 went+0.025
 fit+0.025
aved+0.025
 lined+0.024
Neg logit contributions
 mistake-0.057
 lair-0.054
lor-0.052
oral-0.051
 privacy-0.050
Token the
Token ID262
Feature activation+3.26e-01

Pos logit contributions
 lined+0.007
aved+0.007
cy+0.007
 fit+0.007
 swapped+0.007
Neg logit contributions
 mistake-0.016
 lair-0.015
lor-0.015
oral-0.014
 property-0.014
Token white
Token ID2330
Feature activation-5.74e-02

Pos logit contributions
 lined+0.007
 fit+0.007
cy+0.007
aved+0.007
 swapped+0.006
Neg logit contributions
 mistake-0.013
 lair-0.012
lor-0.012
oral-0.012
 property-0.011
Token mom
Token ID1995
Feature activation-4.68e-01

Pos logit contributions
 lined+0.016
 fit+0.016
cy+0.016
aved+0.016
 swapped+0.015
Neg logit contributions
 mistake-0.028
 lair-0.026
lor-0.026
oral-0.025
 property-0.024
Token bought
Token ID5839
Feature activation-3.05e-01

Pos logit contributions
 mistake+0.019
 lair+0.019
lor+0.018
oral+0.018
 property+0.017
Neg logit contributions
 lined-0.009
 fit-0.009
cy-0.009
aved-0.009
 went-0.008
Token him
Token ID683
Feature activation+7.63e-01

Pos logit contributions
 mistake+0.014
 lair+0.013
lor+0.013
oral+0.013
 property+0.012
Neg logit contributions
 lined-0.005
 fit-0.005
cy-0.005
aved-0.005
 swapped-0.004
Token the
Token ID262
Feature activation+6.55e-01

Pos logit contributions
 fit+0.015
cy+0.015
 lined+0.014
 went+0.014
aved+0.014
Neg logit contributions
 mistake-0.031
 lair-0.030
lor-0.029
oral-0.029
 property-0.027
Token mint
Token ID21043
Feature activation+9.43e-01

Pos logit contributions
 fit+0.014
cy+0.013
 went+0.013
 lined+0.013
 swapped+0.013
Neg logit contributions
 mistake-0.025
 lair-0.025
lor-0.025
oral-0.023
 witches-0.023
Tokens
Token ID82
Feature activation+2.06e+00

Pos logit contributions
 lined+0.024
aved+0.024
 swapped+0.023
cy+0.022
 fit+0.022
Neg logit contributions
 mistake-0.033
 lair-0.030
lor-0.030
oral-0.029
 property-0.028
Token and
Token ID290
Feature activation+6.95e-01

Pos logit contributions
 fit+0.049
 lined+0.046
cy+0.046
 went+0.046
aved+0.044
Neg logit contributions
 mistake-0.076
 lair-0.075
lor-0.074
oral-0.071
 witches-0.067
Token he
Token ID339
Feature activation-6.72e-01

Pos logit contributions
 fit+0.012
 went+0.011
 lined+0.011
 started+0.011
 swapped+0.011
Neg logit contributions
 mistake-0.030
 lair-0.029
lor-0.029
oral-0.029
 property-0.027
Token enjoyed
Token ID8359
Feature activation-9.51e-02

Pos logit contributions
 mistake+0.028
 lair+0.027
lor+0.026
oral+0.026
 property+0.025
Neg logit contributions
 lined-0.012
 fit-0.012
cy-0.012
aved-0.012
 swapped-0.011
Token them
Token ID606
Feature activation+5.07e-01

Pos logit contributions
 mistake+0.004
 lair+0.004
lor+0.004
oral+0.004
 property+0.004
Neg logit contributions
 fit-0.002
 lined-0.002
 went-0.002
 swapped-0.002
 really-0.001
Token so
Token ID523
Feature activation-2.53e-01

Pos logit contributions
 fit+0.011
cy+0.010
 lined+0.010
 went+0.010
 really+0.009
Neg logit contributions
 mistake-0.020
 lair-0.020
lor-0.019
oral-0.019
 property-0.018
Token the
Token ID262
Feature activation+3.73e-02

Pos logit contributions
 mistake+0.013
 lair+0.013
lor+0.013
oral+0.012
 property+0.012
Neg logit contributions
 fit-0.007
 lined-0.006
 swapped-0.006
 went-0.006
cy-0.006
Token guitar
Token ID10047
Feature activation+1.89e-01

Pos logit contributions
 fit+0.001
 lined+0.001
cy+0.001
aved+0.001
 swapped+0.001
Neg logit contributions
 mistake-0.001
 lair-0.001
lor-0.001
oral-0.001
 property-0.001
Token and
Token ID290
Feature activation-1.93e-01

Pos logit contributions
 fit+0.004
 lined+0.004
cy+0.004
aved+0.004
 went+0.004
Neg logit contributions
 mistake-0.008
lor-0.007
 lair-0.007
oral-0.007
 property-0.007
Token make
Token ID787
Feature activation-3.67e-01

Pos logit contributions
 mistake+0.009
lor+0.008
 lair+0.008
oral+0.008
 property+0.008
Neg logit contributions
 fit-0.003
 lined-0.003
 went-0.003
 swapped-0.003
 started-0.003
Token fun
Token ID1257
Feature activation+4.45e-01

Pos logit contributions
 mistake+0.014
 lair+0.013
lor+0.012
oral+0.012
 property+0.012
Neg logit contributions
cy-0.009
aved-0.008
 lined-0.008
 fit-0.008
 swapped-0.008
Token sounds
Token ID5238
Feature activation+2.01e+00

Pos logit contributions
 fit+0.011
 lined+0.010
cy+0.010
 went+0.010
 swapped+0.010
Neg logit contributions
 mistake-0.016
 lair-0.016
lor-0.016
oral-0.015
 property-0.014
Token.
Token ID13
Feature activation-3.10e-01

Pos logit contributions
aved+0.066
cy+0.065
 Derek+0.061
 lined+0.060
organ+0.059
Neg logit contributions
 mistake-0.055
 lair-0.049
 reading-0.047
 witches-0.045
oral-0.044
Token He
Token ID679
Feature activation-3.09e-01

Pos logit contributions
 mistake+0.009
 lair+0.009
lor+0.009
oral+0.008
 property+0.008
Neg logit contributions
aved-0.009
cy-0.009
 fit-0.009
 lined-0.009
 swapped-0.009
Token tried
Token ID3088
Feature activation-7.90e-01

Pos logit contributions
 mistake+0.011
 lair+0.010
oral+0.009
 witch+0.009
 reading+0.009
Neg logit contributions
pool-0.007
aved-0.007
 Derek-0.006
cy-0.006
hot-0.006
Token to
Token ID284
Feature activation-2.12e-01

Pos logit contributions
lor+0.030
 mistake+0.030
 lair+0.029
oral+0.029
 property+0.027
Neg logit contributions
 fit-0.017
 lined-0.017
 went-0.016
 really-0.016
 started-0.016
Token play
Token ID711
Feature activation-1.25e-01

Pos logit contributions
 mistake+0.008
 lair+0.008
lor+0.007
oral+0.007
 property+0.007
Neg logit contributions
aved-0.005
cy-0.005
 lined-0.004
pool-0.004
 swapped-0.004
Token happy
Token ID3772
Feature activation+4.58e-01

Pos logit contributions
aved+0.010
 lined+0.009
cy+0.009
pool+0.009
 swapped+0.009
Neg logit contributions
 mistake-0.016
 lair-0.015
lor-0.014
 witches-0.013
 reading-0.013
Token.
Token ID13
Feature activation-2.97e-01

Pos logit contributions
 fit+0.006
 went+0.006
 really+0.006
 started+0.005
 lined+0.005
Neg logit contributions
oral-0.023
lor-0.023
 mistake-0.022
cod-0.021
reetings-0.021
Token But
Token ID887
Feature activation-3.97e-01

Pos logit contributions
 mistake+0.013
 lair+0.012
lor+0.012
oral+0.011
 property+0.011
Neg logit contributions
cy-0.006
 lined-0.006
 went-0.005
 fit-0.005
aved-0.005
Token one
Token ID530
Feature activation-9.91e-01

Pos logit contributions
lor+0.017
 lair+0.017
 mistake+0.017
oral+0.016
cod+0.015
Neg logit contributions
 went-0.007
 fit-0.007
cy-0.007
 lined-0.007
 started-0.006
Token day
Token ID1110
Feature activation+1.63e+00

Pos logit contributions
 mistake+0.042
 lair+0.038
oral+0.038
lor+0.036
 witch+0.036
Neg logit contributions
aved-0.019
 lined-0.018
 swapped-0.018
 Derek-0.017
 fit-0.016
Token when
Token ID618
Feature activation+2.09e+00

Pos logit contributions
aved+0.040
 lined+0.039
cy+0.039
 fit+0.038
 swapped+0.038
Neg logit contributions
 mistake-0.059
 lair-0.053
lor-0.052
oral-0.051
 witches-0.050
Token she
Token ID673
Feature activation+1.40e+00

Pos logit contributions
 lined+0.042
 fit+0.042
cy+0.041
aved+0.040
 went+0.039
Neg logit contributions
 mistake-0.084
 lair-0.081
lor-0.079
oral-0.077
 property-0.074
Token was
Token ID373
Feature activation+6.46e-01

Pos logit contributions
 Derek+0.035
aved+0.035
 lined+0.033
 Drew+0.033
pool+0.033
Neg logit contributions
 mistake-0.049
 lair-0.042
oral-0.040
 witch-0.040
lor-0.040
Token out
Token ID503
Feature activation-4.40e-02

Pos logit contributions
 fit+0.016
 lined+0.015
 went+0.014
 started+0.014
 really+0.014
Neg logit contributions
 mistake-0.024
lor-0.024
oral-0.024
 lair-0.023
 property-0.022
Token playing
Token ID2712
Feature activation+1.10e+00

Pos logit contributions
lor+0.002
 mistake+0.002
 lair+0.002
oral+0.002
 property+0.002
Neg logit contributions
 fit-0.001
 went-0.001
cy-0.001
 started-0.001
aved-0.001
Token,
Token ID11
Feature activation+1.38e+00

Pos logit contributions
 fit+0.019
 lined+0.017
 went+0.017
 started+0.016
cy+0.016
Neg logit contributions
 mistake-0.049
 lair-0.048
lor-0.047
oral-0.047
 property-0.044
Token catch
Token ID4929
Feature activation-4.58e-01

Pos logit contributions
 mistake+0.006
 lair+0.006
 property+0.005
 witches+0.005
 privacy+0.005
Neg logit contributions
aved-0.004
 lined-0.004
cy-0.004
pool-0.003
attered-0.003
Token his
Token ID465
Feature activation+1.31e+00

Pos logit contributions
 mistake+0.017
 lair+0.015
lor+0.014
oral+0.014
 witches+0.014
Neg logit contributions
aved-0.011
 lined-0.011
cy-0.011
organ-0.011
 fit-0.010
Token things
Token ID1243
Feature activation+4.10e-01

Pos logit contributions
 fit+0.035
 lined+0.034
cy+0.034
 swapped+0.033
 went+0.033
Neg logit contributions
 mistake-0.044
lor-0.043
 lair-0.042
oral-0.041
reetings-0.038
Token,
Token ID11
Feature activation-9.61e-01

Pos logit contributions
cy+0.012
aved+0.012
 lined+0.012
organ+0.011
 swapped+0.011
Neg logit contributions
 mistake-0.013
 lair-0.011
lor-0.010
 property-0.010
oral-0.010
Token but
Token ID475
Feature activation-9.11e-01

Pos logit contributions
lor+0.041
oral+0.039
 lair+0.037
reetings+0.037
 Majesty+0.036
Neg logit contributions
 fit-0.022
 lined-0.022
 really-0.020
 went-0.020
 started-0.020
Token they
Token ID484
Feature activation+1.17e-01

Pos logit contributions
 mistake+0.040
 lair+0.040
lor+0.039
oral+0.038
 property+0.037
Neg logit contributions
 fit-0.015
 lined-0.015
 went-0.014
cy-0.013
 started-0.013
Token flew
Token ID13112
Feature activation-4.14e-01

Pos logit contributions
cy+0.002
organ+0.002
inn+0.002
aved+0.002
 swapped+0.002
Neg logit contributions
 mistake-0.005
 lair-0.004
 witches-0.004
lor-0.004
 privacy-0.004
Token everywhere
Token ID8347
Feature activation-1.79e+00

Pos logit contributions
 mistake+0.018
 lair+0.017
lor+0.016
 property+0.016
oral+0.016
Neg logit contributions
cy-0.006
aved-0.006
 lined-0.006
 swapped-0.006
organ-0.006
Token.
Token ID13
Feature activation-3.44e-01

Pos logit contributions
lor+0.082
 lair+0.081
 mistake+0.081
 Majesty+0.078
 property+0.078
Neg logit contributions
 fit-0.030
 lined-0.026
 went-0.026
 really-0.026
 started-0.025
Token The
Token ID383
Feature activation-6.92e-01

Pos logit contributions
 mistake+0.015
 lair+0.014
lor+0.014
oral+0.014
 property+0.014
Neg logit contributions
 lined-0.006
 fit-0.006
cy-0.005
 swapped-0.005
aved-0.005
Token other
Token ID584
Feature activation-7.02e-01

Pos logit contributions
lor+0.027
 lair+0.026
 mistake+0.026
reetings+0.024
oral+0.024
Neg logit contributions
 fit-0.016
 went-0.015
 swapped-0.015
 started-0.015
 lined-0.014
Token time
Token ID640
Feature activation+1.25e+00

Pos logit contributions
 mistake+0.006
 lair+0.005
lor+0.005
oral+0.005
 property+0.005
Neg logit contributions
cy-0.003
aved-0.003
 lined-0.003
 fit-0.003
 swapped-0.003
Token and
Token ID290
Feature activation+5.03e-01

Pos logit contributions
 lined+0.024
 fit+0.023
 went+0.021
 really+0.020
aved+0.020
Neg logit contributions
oral-0.057
lor-0.057
 mistake-0.057
 lair-0.053
 privacy-0.052
Token got
Token ID1392
Feature activation+3.81e-01

Pos logit contributions
 lined+0.010
 fit+0.009
cy+0.009
aved+0.009
 swapped+0.009
Neg logit contributions
 mistake-0.021
 lair-0.020
lor-0.020
oral-0.019
 property-0.018
Token very
Token ID845
Feature activation-3.40e-01

Pos logit contributions
cy+0.009
pool+0.008
aved+0.008
 lined+0.008
inn+0.008
Neg logit contributions
 mistake-0.013
 lair-0.013
 privacy-0.012
 property-0.012
 reading-0.012
Token tired
Token ID10032
Feature activation+3.18e-01

Pos logit contributions
 mistake+0.014
 lair+0.013
oral+0.012
 property+0.012
 privacy+0.012
Neg logit contributions
aved-0.006
cy-0.006
 lined-0.006
 Derek-0.006
pool-0.006
Token.
Token ID13
Feature activation+2.29e-01

Pos logit contributions
 lined+0.008
cy+0.007
 fit+0.007
aved+0.007
 swapped+0.007
Neg logit contributions
 mistake-0.012
 lair-0.011
lor-0.011
oral-0.010
 property-0.010
Token Then
Token ID3244
Feature activation+1.12e+00

Pos logit contributions
 lined+0.004
cy+0.004
 fit+0.004
aved+0.004
 swapped+0.004
Neg logit contributions
 mistake-0.010
 lair-0.009
lor-0.009
oral-0.009
 property-0.009
Token they
Token ID484
Feature activation+3.37e-02

Pos logit contributions
 lined+0.021
 fit+0.021
cy+0.021
aved+0.020
 went+0.020
Neg logit contributions
 mistake-0.047
 lair-0.045
lor-0.043
oral-0.043
 property-0.041
Token saw
Token ID2497
Feature activation-5.27e-01

Pos logit contributions
 lined+0.000
cy+0.000
 fit+0.000
aved+0.000
 swapped+0.000
Neg logit contributions
 mistake-0.002
 lair-0.002
lor-0.002
oral-0.002
 property-0.001
Token a
Token ID257
Feature activation+1.12e-01

Pos logit contributions
 mistake+0.022
 lair+0.021
lor+0.021
oral+0.021
 property+0.020
Neg logit contributions
 lined-0.009
 fit-0.009
cy-0.009
aved-0.009
 went-0.009
Token big
Token ID1263
Feature activation-1.66e-01

Pos logit contributions
 fit+0.003
 really+0.002
 went+0.002
 lined+0.002
 started+0.002
Neg logit contributions
lor-0.004
 lair-0.004
 mistake-0.004
oral-0.004
adies-0.004
Token.
Token ID13
Feature activation-5.13e-01

Pos logit contributions
 mistake+0.011
oral+0.011
lor+0.011
 lair+0.011
reetings+0.010
Neg logit contributions
 went-0.005
 started-0.004
 fit-0.004
 lined-0.004
 really-0.004
Token He
Token ID679
Feature activation-4.99e-01

Pos logit contributions
 mistake+0.025
 lair+0.023
 property+0.022
lor+0.022
 privacy+0.022
Neg logit contributions
 lined-0.008
 went-0.008
cy-0.007
 Derek-0.007
 fit-0.007
Token forgot
Token ID16453
Feature activation-8.81e-01

Pos logit contributions
 mistake+0.024
lor+0.023
 lair+0.023
oral+0.023
 property+0.022
Neg logit contributions
 went-0.008
 started-0.008
 lined-0.007
 fit-0.007
 really-0.007
Token to
Token ID284
Feature activation-2.56e-01

Pos logit contributions
lor+0.041
oral+0.040
reetings+0.037
 mistake+0.036
 lair+0.036
Neg logit contributions
 started-0.016
 went-0.016
 really-0.016
 fit-0.015
 lined-0.015
Token eat
Token ID4483
Feature activation-8.50e-01

Pos logit contributions
 lair+0.010
lor+0.010
 mistake+0.010
oral+0.010
 privacy+0.009
Neg logit contributions
 fit-0.005
 lined-0.005
 went-0.005
 swapped-0.005
cy-0.005
Token his
Token ID465
Feature activation+6.37e-02

Pos logit contributions
lor+0.040
 lair+0.039
oral+0.039
 mistake+0.039
reetings+0.036
Neg logit contributions
 fit-0.013
 really-0.011
 lined-0.011
 went-0.011
 started-0.010
Token lunch
Token ID9965
Feature activation-1.15e+00

Pos logit contributions
 fit+0.002
 went+0.001
 really+0.001
 started+0.001
cy+0.001
Neg logit contributions
lor-0.002
oral-0.002
reetings-0.002
 mistake-0.002
 lair-0.002
Token.
Token ID13
Feature activation-5.14e-01

Pos logit contributions
 mistake+0.044
 lair+0.042
oral+0.042
lor+0.041
 property+0.039
Neg logit contributions
 fit-0.026
 lined-0.025
 went-0.024
cy-0.024
 swapped-0.023
Token\n
Token ID198
Feature activation-9.66e-01

Pos logit contributions
 mistake+0.025
 lair+0.023
 property+0.022
lor+0.022
oral+0.022
Neg logit contributions
 lined-0.007
 went-0.006
 fit-0.006
cy-0.006
 Derek-0.006
Token\n
Token ID198
Feature activation-5.50e-01

Pos logit contributions
 mistake+0.042
 lair+0.040
 property+0.040
 privacy+0.039
lor+0.039
Neg logit contributions
 lined-0.016
inn-0.016
 really-0.016
 fit-0.016
 went-0.015
TokenWhen
Token ID2215
Feature activation+1.33e+00

Pos logit contributions
 mistake+0.020
 lair+0.020
lor+0.018
 privacy+0.018
 property+0.018
Neg logit contributions
 lined-0.013
inn-0.013
 fit-0.013
 went-0.013
cy-0.013
Token hear
Token ID3285
Feature activation-7.97e-01

Pos logit contributions
lor+0.017
 mistake+0.017
 lair+0.016
oral+0.016
reetings+0.015
Neg logit contributions
 fit-0.012
 went-0.011
 lined-0.011
aved-0.011
 started-0.011
Token a
Token ID257
Feature activation-1.74e-01

Pos logit contributions
 mistake+0.035
 lair+0.034
lor+0.034
oral+0.033
 property+0.031
Neg logit contributions
 lined-0.013
 fit-0.013
 went-0.012
aved-0.012
cy-0.012
Token voice
Token ID3809
Feature activation+9.56e-01

Pos logit contributions
lor+0.006
 lair+0.006
 mistake+0.006
oral+0.006
 property+0.006
Neg logit contributions
 fit-0.004
 lined-0.004
 went-0.004
cy-0.004
 started-0.004
Token.
Token ID13
Feature activation+8.25e-01

Pos logit contributions
 lined+0.023
cy+0.023
aved+0.023
 fit+0.022
 swapped+0.022
Neg logit contributions
 mistake-0.035
 lair-0.032
lor-0.031
oral-0.031
 property-0.029
Token It
Token ID632
Feature activation+1.15e+00

Pos logit contributions
aved+0.021
cy+0.021
 lined+0.021
 fit+0.020
 swapped+0.020
Neg logit contributions
 mistake-0.028
 lair-0.026
lor-0.025
oral-0.025
 property-0.024
Token is
Token ID318
Feature activation+7.69e-01

Pos logit contributions
 fit+0.040
 went+0.039
 lined+0.039
 started+0.038
cy+0.037
Neg logit contributions
 mistake-0.030
 lair-0.029
lor-0.029
oral-0.027
 property-0.026
Token the
Token ID262
Feature activation-1.43e+00

Pos logit contributions
aved+0.017
cy+0.017
 lined+0.016
pool+0.016
 swapped+0.016
Neg logit contributions
 mistake-0.031
 lair-0.028
oral-0.026
lor-0.026
 witches-0.026
Token farmer
Token ID18739
Feature activation-1.24e+00

Pos logit contributions
lor+0.046
 mistake+0.044
 lair+0.044
reetings+0.041
oral+0.041
Neg logit contributions
 fit-0.040
 lined-0.039
 went-0.038
 swapped-0.036
cy-0.036
Token.
Token ID13
Feature activation+8.66e-01

Pos logit contributions
 mistake+0.048
 lair+0.047
lor+0.047
oral+0.046
 privacy+0.043
Neg logit contributions
 fit-0.026
cy-0.025
 lined-0.025
 went-0.025
 started-0.023
Token He
Token ID679
Feature activation+4.75e-01

Pos logit contributions
 lined+0.017
aved+0.017
cy+0.017
 fit+0.016
 swapped+0.016
Neg logit contributions
 mistake-0.035
 lair-0.034
lor-0.033
oral-0.032
 property-0.031
Token says
Token ID1139
Feature activation-1.55e+00

Pos logit contributions
 went+0.015
 fit+0.014
 started+0.014
 lined+0.014
 really+0.014
Neg logit contributions
 mistake-0.014
lor-0.014
 lair-0.014
oral-0.013
 property-0.013
Token the
Token ID262
Feature activation+4.73e-01

Pos logit contributions
 mistake+0.009
lor+0.008
 lair+0.008
oral+0.008
 witches+0.008
Neg logit contributions
 lined-0.006
aved-0.006
 swapped-0.006
organ-0.006
 fit-0.006
Token o
Token ID267
Feature activation-1.52e+00

Pos logit contributions
 fit+0.012
cy+0.012
 went+0.011
 lined+0.011
 started+0.011
Neg logit contributions
 lair-0.017
 mistake-0.017
lor-0.016
oral-0.015
 privacy-0.015
Tokenat
Token ID265
Feature activation-1.09e+00

Pos logit contributions
 mistake+0.066
 lair+0.062
 privacy+0.058
 property+0.057
lor+0.057
Neg logit contributions
cy-0.028
 lined-0.027
aved-0.027
 fit-0.026
 went-0.025
Token soup
Token ID17141
Feature activation-1.37e-01

Pos logit contributions
 mistake+0.047
 lair+0.045
lor+0.044
oral+0.043
 property+0.041
Neg logit contributions
 lined-0.019
 fit-0.019
cy-0.019
aved-0.018
 swapped-0.017
Token in
Token ID287
Feature activation+4.58e-01

Pos logit contributions
 mistake+0.006
 lair+0.006
lor+0.006
oral+0.006
 property+0.006
Neg logit contributions
 fit-0.002
cy-0.002
 lined-0.002
aved-0.002
 went-0.002
Token a
Token ID257
Feature activation-8.95e-03

Pos logit contributions
 fit+0.007
cy+0.007
 lined+0.007
 went+0.007
aved+0.007
Neg logit contributions
 mistake-0.020
 lair-0.020
lor-0.020
oral-0.019
 property-0.019
Token bowl
Token ID9396
Feature activation+5.06e-01

Pos logit contributions
 mistake+0.000
 lair+0.000
lor+0.000
oral+0.000
 property+0.000
Neg logit contributions
 fit-0.000
 went-0.000
 started-0.000
 swapped-0.000
 really-0.000
Token for
Token ID329
Feature activation-2.19e-01

Pos logit contributions
 fit+0.012
 went+0.012
 lined+0.012
 started+0.011
cy+0.011
Neg logit contributions
 mistake-0.020
oral-0.019
 lair-0.019
lor-0.018
 Majesty-0.018
Token her
Token ID607
Feature activation+1.01e+00

Pos logit contributions
 mistake+0.009
 lair+0.009
lor+0.009
oral+0.009
 property+0.009
Neg logit contributions
 fit-0.004
 lined-0.004
 went-0.003
 swapped-0.003
aved-0.003
Token little
Token ID1310
Feature activation+1.25e+00

Pos logit contributions
 fit+0.024
 went+0.022
cy+0.022
 lined+0.022
 swapped+0.021
Neg logit contributions
lor-0.038
 mistake-0.037
 lair-0.036
oral-0.035
reetings-0.034
Token girl
Token ID2576
Feature activation+7.70e-01

Pos logit contributions
cy+0.030
 fit+0.030
 lined+0.030
aved+0.029
 went+0.029
Neg logit contributions
 mistake-0.045
lor-0.042
 lair-0.042
oral-0.041
 property-0.038
Token more
Token ID517
Feature activation+8.32e-01

Pos logit contributions
 really+0.016
 very+0.013
 fit+0.013
 off+0.013
 around+0.013
Neg logit contributions
 Majesty-0.033
lor-0.033
oral-0.032
 majesty-0.032
 lair-0.031
Token about
Token ID546
Feature activation+8.49e-01

Pos logit contributions
 really+0.019
 fit+0.017
 around+0.017
 lined+0.017
 very+0.017
Neg logit contributions
lor-0.035
 lair-0.031
 Majesty-0.031
oral-0.030
ich-0.030
Token the
Token ID262
Feature activation+1.48e-01

Pos logit contributions
 fit+0.011
 really+0.011
 Tim+0.010
 Bobby+0.010
 went+0.008
Neg logit contributions
lor-0.044
oral-0.043
adies-0.039
reetings-0.039
 lair-0.037
Token cookies
Token ID14746
Feature activation-7.10e-01

Pos logit contributions
 fit+0.004
 really+0.004
 went+0.004
cy+0.004
 lined+0.003
Neg logit contributions
lor-0.005
 lair-0.005
 mistake-0.004
oral-0.004
reetings-0.004
Token and
Token ID290
Feature activation+7.50e-01

Pos logit contributions
 mistake+0.028
 lair+0.027
oral+0.026
lor+0.026
 Majesty+0.024
Neg logit contributions
 fit-0.016
 lined-0.015
 went-0.015
cy-0.014
 matched-0.014
Token he
Token ID339
Feature activation-3.62e-01

Pos logit contributions
 fit+0.015
 went+0.014
 started+0.014
 lined+0.014
 swapped+0.012
Neg logit contributions
lor-0.034
oral-0.031
 lair-0.031
 mistake-0.031
reetings-0.030
Token asked
Token ID1965
Feature activation-6.91e-01

Pos logit contributions
 mistake+0.017
 lair+0.016
lor+0.016
oral+0.016
 property+0.015
Neg logit contributions
 fit-0.006
 lined-0.005
 went-0.005
 started-0.005
cy-0.005
Token his
Token ID465
Feature activation+1.07e+00

Pos logit contributions
lor+0.032
 mistake+0.031
 lair+0.031
oral+0.030
 property+0.029
Neg logit contributions
 fit-0.010
 went-0.010
 lined-0.010
 really-0.009
 started-0.009
Token mom
Token ID1995
Feature activation+4.49e-01

Pos logit contributions
 fit+0.023
 went+0.021
 lined+0.021
cy+0.020
 really+0.020
Neg logit contributions
lor-0.042
 mistake-0.040
 lair-0.039
oral-0.038
reetings-0.037
Token where
Token ID810
Feature activation+3.86e-01

Pos logit contributions
 went+0.009
 really+0.009
cy+0.009
 fit+0.009
 lined+0.008
Neg logit contributions
 mistake-0.019
lor-0.018
 lair-0.018
oral-0.017
 property-0.017
Token they
Token ID484
Feature activation+1.34e+00

Pos logit contributions
 went+0.005
 fit+0.005
 started+0.004
 lined+0.004
 really+0.004
Neg logit contributions
 mistake-0.019
lor-0.019
 lair-0.019
oral-0.018
cod-0.018
Token there
Token ID612
Feature activation+1.19e+00

Pos logit contributions
 mistake+0.019
 lair+0.017
oral+0.016
lor+0.016
reetings+0.015
Neg logit contributions
 lined-0.023
aved-0.023
 swapped-0.023
 Derek-0.022
 fit-0.022
Token was
Token ID373
Feature activation+9.88e-02

Pos logit contributions
aved+0.037
 swapped+0.036
 lined+0.034
pool+0.034
inn+0.033
Neg logit contributions
 mistake-0.036
 lair-0.031
lor-0.030
oral-0.030
 story-0.028
Token a
Token ID257
Feature activation-1.08e-01

Pos logit contributions
aved+0.003
 lined+0.003
cy+0.003
pool+0.003
 swapped+0.003
Neg logit contributions
 mistake-0.003
 lair-0.003
 reading-0.003
oral-0.003
 witches-0.002
Token little
Token ID1310
Feature activation+7.12e-01

Pos logit contributions
 mistake+0.004
 lair+0.003
oral+0.003
 witch+0.003
lor+0.003
Neg logit contributions
aved-0.003
 lined-0.003
cy-0.003
 swapped-0.003
pool-0.003
Token girl
Token ID2576
Feature activation-5.86e-01

Pos logit contributions
aved+0.023
 lined+0.023
 swapped+0.022
organ+0.022
 Derek+0.021
Neg logit contributions
 mistake-0.020
 witch-0.018
 lair-0.017
 witches-0.017
oral-0.017
Token named
Token ID3706
Feature activation-9.44e-01

Pos logit contributions
 mistake+0.020
 lair+0.019
lor+0.018
oral+0.018
 property+0.017
Neg logit contributions
 lined-0.015
aved-0.015
 fit-0.015
cy-0.015
 swapped-0.015
Token Lily
Token ID20037
Feature activation-8.60e-02

Pos logit contributions
 mistake+0.042
 lair+0.040
lor+0.039
oral+0.039
 property+0.037
Neg logit contributions
 lined-0.015
 fit-0.014
cy-0.014
aved-0.014
 swapped-0.014
Token who
Token ID508
Feature activation-5.15e-01

Pos logit contributions
 mistake+0.003
 lair+0.003
lor+0.002
 property+0.002
oral+0.002
Neg logit contributions
aved-0.002
 lined-0.002
 swapped-0.002
 Derek-0.002
 fit-0.002
Token loved
Token ID6151
Feature activation+4.04e-01

Pos logit contributions
 mistake+0.012
 living+0.011
 witch+0.011
 story+0.011
 reading+0.011
Neg logit contributions
 Derek-0.019
pool-0.018
inn-0.018
 lined-0.018
aved-0.017
Token to
Token ID284
Feature activation-1.89e-01

Pos logit contributions
 lined+0.011
cy+0.011
aved+0.011
 fit+0.011
 swapped+0.010
Neg logit contributions
 mistake-0.014
 lair-0.013
lor-0.012
oral-0.012
 property-0.011
Token play
Token ID711
Feature activation-4.26e-01

Pos logit contributions
 mistake+0.008
 lair+0.007
oral+0.007
lor+0.007
 witches+0.006
Neg logit contributions
cy-0.004
aved-0.004
 lined-0.004
inn-0.003
 swapped-0.003
Token the
Token ID262
Feature activation-1.17e-01

Pos logit contributions
 lined+0.016
 fit+0.015
cy+0.015
aved+0.015
 swapped+0.015
Neg logit contributions
 mistake-0.038
 lair-0.036
lor-0.036
oral-0.035
 property-0.033
Token net
Token ID2010
Feature activation-8.20e-01

Pos logit contributions
lor+0.004
reetings+0.004
oral+0.004
 mistake+0.004
land+0.004
Neg logit contributions
 fit-0.003
 really-0.003
 started-0.003
 went-0.003
 lined-0.003
Token.
Token ID13
Feature activation+9.62e-02

Pos logit contributions
 mistake+0.032
oral+0.031
lor+0.030
 lair+0.030
 Majesty+0.028
Neg logit contributions
 fit-0.018
 lined-0.017
 went-0.017
cy-0.017
aved-0.017
Token\n
Token ID198
Feature activation-4.60e-01

Pos logit contributions
 fit+0.002
 lined+0.002
cy+0.002
aved+0.002
 swapped+0.002
Neg logit contributions
 mistake-0.003
 lair-0.003
lor-0.003
oral-0.003
 property-0.003
Token\n
Token ID198
Feature activation+4.00e-03

Pos logit contributions
 mistake+0.017
 lair+0.017
lor+0.016
 property+0.015
oral+0.015
Neg logit contributions
 lined-0.010
 fit-0.010
aved-0.010
cy-0.010
 went-0.010
TokenO
Token ID46
Feature activation-3.38e-01

Pos logit contributions
 lined+0.000
cy+0.000
aved+0.000
 fit+0.000
 swapped+0.000
Neg logit contributions
 mistake-0.000
 lair-0.000
lor-0.000
oral-0.000
 property-0.000
Tokenll
Token ID297
Feature activation-1.58e+00

Pos logit contributions
 mistake+0.010
 lair+0.010
lor+0.009
oral+0.009
 property+0.008
Neg logit contributions
 lined-0.010
cy-0.010
 fit-0.010
aved-0.010
 swapped-0.010
Tokenie
Token ID494
Feature activation-1.11e+00

Pos logit contributions
 mistake+0.067
 lair+0.064
lor+0.063
oral+0.062
 property+0.059
Neg logit contributions
cy-0.028
 lined-0.028
 fit-0.028
aved-0.027
 went-0.026
Token was
Token ID373
Feature activation+1.06e-02

Pos logit contributions
 mistake+0.048
 lair+0.047
lor+0.046
oral+0.046
 property+0.044
Neg logit contributions
 went-0.021
 started-0.021
 fit-0.020
 lined-0.019
 really-0.018
Token grateful
Token ID14066
Feature activation-7.25e-01

Pos logit contributions
cy+0.000
 lined+0.000
aved+0.000
 fit+0.000
 swapped+0.000
Neg logit contributions
 mistake-0.000
 lair-0.000
lor-0.000
oral-0.000
 property-0.000
Token and
Token ID290
Feature activation-2.86e-01

Pos logit contributions
lor+0.030
 lair+0.029
oral+0.029
 Majesty+0.028
 mistake+0.028
Neg logit contributions
 fit-0.014
 really-0.014
 went-0.014
 started-0.013
 lined-0.013
Token Josh
Token ID8518
Feature activation-8.78e-01

Pos logit contributions
 mistake+0.023
 lair+0.021
lor+0.021
oral+0.020
 property+0.019
Neg logit contributions
 lined-0.014
 fit-0.014
aved-0.014
cy-0.014
 swapped-0.014
Token and
Token ID290
Feature activation+2.97e-01

Pos logit contributions
 mistake+0.034
 lair+0.033
lor+0.033
oral+0.033
 conclusions+0.031
Neg logit contributions
 lined-0.019
 went-0.019
 really-0.018
 fit-0.018
cy-0.018
Token Anne
Token ID15397
Feature activation+1.62e-01

Pos logit contributions
aved+0.008
cy+0.007
inn+0.007
 lined+0.007
pool+0.007
Neg logit contributions
 mistake-0.010
 lair-0.009
 witches-0.009
 Majesty-0.009
 witch-0.008
Token.
Token ID13
Feature activation+2.78e-01

Pos logit contributions
 lined+0.004
 fit+0.004
cy+0.004
aved+0.004
 swapped+0.004
Neg logit contributions
 mistake-0.006
 lair-0.006
lor-0.005
oral-0.005
 property-0.005
Token Josh
Token ID8518
Feature activation-6.20e-01

Pos logit contributions
 lined+0.007
cy+0.006
 fit+0.006
 went+0.006
aved+0.006
Neg logit contributions
 mistake-0.010
 lair-0.010
lor-0.010
oral-0.009
 property-0.009
Token was
Token ID373
Feature activation-2.93e-01

Pos logit contributions
 mistake+0.026
 lair+0.024
lor+0.024
oral+0.023
 property+0.022
Neg logit contributions
 lined-0.013
 fit-0.012
 went-0.012
cy-0.012
 started-0.011
Token a
Token ID257
Feature activation-6.23e-01

Pos logit contributions
 mistake+0.010
 lair+0.010
 reading+0.010
 witches+0.009
 witch+0.009
Neg logit contributions
aved-0.007
pool-0.007
cy-0.007
 swapped-0.006
attered-0.006
Token big
Token ID1263
Feature activation-1.16e-01

Pos logit contributions
 mistake+0.021
 lair+0.020
oral+0.019
lor+0.018
 witch+0.018
Neg logit contributions
aved-0.016
 lined-0.016
 swapped-0.015
cy-0.015
pool-0.015
Token,
Token ID11
Feature activation-1.48e+00

Pos logit contributions
lor+0.004
 mistake+0.004
 lair+0.004
oral+0.004
 privacy+0.004
Neg logit contributions
cy-0.003
 fit-0.003
 lined-0.003
 went-0.003
 really-0.003
Token strong
Token ID1913
Feature activation-4.62e-01

Pos logit contributions
 mistake+0.062
 lair+0.059
lor+0.059
oral+0.057
 property+0.054
Neg logit contributions
 fit-0.028
 lined-0.028
cy-0.027
aved-0.026
 went-0.026
Token elephant
Token ID20950
Feature activation-9.63e-01

Pos logit contributions
lor+0.017
 mistake+0.015
 lair+0.015
oral+0.014
reetings+0.014
Neg logit contributions
 lined-0.013
cy-0.013
 fit-0.013
 went-0.012
ny-0.012
TokenAnna
Token ID31160
Feature activation-2.17e-01

Pos logit contributions
 lair+0.007
 mistake+0.007
lor+0.007
 property+0.007
 Majesty+0.007
Neg logit contributions
cy-0.005
 fit-0.005
 went-0.005
 swapped-0.005
 lined-0.005
Token and
Token ID290
Feature activation+5.36e-01

Pos logit contributions
 lair+0.008
lor+0.008
 mistake+0.008
reetings+0.007
 privacy+0.007
Neg logit contributions
 went-0.005
cy-0.005
 started-0.005
 lined-0.005
 fit-0.005
Token Ben
Token ID3932
Feature activation+1.24e+00

Pos logit contributions
 lined+0.012
aved+0.012
 fit+0.011
cy+0.011
 swapped+0.011
Neg logit contributions
 mistake-0.021
 lair-0.019
oral-0.018
 witches-0.018
lor-0.018
Token nod
Token ID18666
Feature activation+1.24e-01

Pos logit contributions
aved+0.032
 lined+0.032
inn+0.032
cy+0.031
 Derek+0.031
Neg logit contributions
 mistake-0.042
 lair-0.036
oral-0.036
 property-0.035
 privacy-0.034
Token.
Token ID13
Feature activation+1.04e+00

Pos logit contributions
 went+0.002
 fit+0.002
cy+0.002
 started+0.002
 really+0.002
Neg logit contributions
lor-0.005
 lair-0.005
 mistake-0.005
 Majesty-0.005
oral-0.005
Token They
Token ID1119
Feature activation-3.59e-01

Pos logit contributions
 lined+0.019
 fit+0.019
aved+0.019
cy+0.018
 swapped+0.018
Neg logit contributions
 mistake-0.044
 lair-0.042
lor-0.040
oral-0.040
 property-0.038
Token sit
Token ID1650
Feature activation-5.46e-01

Pos logit contributions
 mistake+0.012
 lair+0.010
oral+0.010
 privacy+0.010
 property+0.010
Neg logit contributions
inn-0.009
cy-0.009
 lined-0.009
pool-0.009
aved-0.009
Token next
Token ID1306
Feature activation-3.41e-01

Pos logit contributions
 mistake+0.024
 lair+0.023
lor+0.023
oral+0.022
 property+0.021
Neg logit contributions
cy-0.010
 fit-0.009
 lined-0.009
 went-0.009
aved-0.009
Token to
Token ID284
Feature activation+2.59e-01

Pos logit contributions
 mistake+0.012
 lair+0.012
lor+0.012
oral+0.011
 property+0.011
Neg logit contributions
 lined-0.008
cy-0.008
 fit-0.008
 went-0.008
aved-0.008
Token Mom
Token ID11254
Feature activation+3.54e-01

Pos logit contributions
 lined+0.005
aved+0.005
 fit+0.004
cy+0.004
 swapped+0.004
Neg logit contributions
 mistake-0.011
 lair-0.011
lor-0.010
oral-0.010
 property-0.010
Tokenmy
Token ID1820
Feature activation-1.42e-01

Pos logit contributions
aved+0.009
 lined+0.009
 fit+0.009
 swapped+0.009
cy+0.009
Neg logit contributions
 mistake-0.012
 lair-0.011
lor-0.011
oral-0.010
 property-0.010
Token stealing
Token ID16568
Feature activation-3.01e-01

Pos logit contributions
 fit+0.014
 lined+0.014
 really+0.013
 started+0.013
 went+0.013
Neg logit contributions
oral-0.021
lor-0.021
 mistake-0.020
 lair-0.020
 privacy-0.019
Token honey
Token ID12498
Feature activation-2.83e-01

Pos logit contributions
 lair+0.013
lor+0.013
oral+0.013
 mistake+0.012
 Majesty+0.012
Neg logit contributions
 lined-0.005
 fit-0.005
 went-0.005
 started-0.005
 really-0.005
Token from
Token ID422
Feature activation+8.72e-01

Pos logit contributions
 mistake+0.011
oral+0.011
 lair+0.011
lor+0.010
 Majesty+0.010
Neg logit contributions
 fit-0.006
 lined-0.006
 went-0.006
 started-0.005
cy-0.005
Token the
Token ID262
Feature activation-2.84e-01

Pos logit contributions
 fit+0.012
 lined+0.011
cy+0.011
 went+0.011
 Derek+0.010
Neg logit contributions
lor-0.040
 mistake-0.040
oral-0.039
 lair-0.039
reetings-0.036
Token be
Token ID307
Feature activation+2.31e-01

Pos logit contributions
lor+0.011
asks+0.009
oral+0.009
reetings+0.009
 mistake+0.009
Neg logit contributions
 fit-0.008
 went-0.007
 really-0.007
 started-0.007
cy-0.006
Tokeneh
Token ID17231
Feature activation-2.49e+00

Pos logit contributions
 lined+0.006
 fit+0.006
 started+0.006
 went+0.006
 really+0.006
Neg logit contributions
 lair-0.008
 mistake-0.008
oral-0.008
lor-0.008
 privacy-0.008
Tokenives
Token ID1083
Feature activation-2.59e-02

Pos logit contributions
 mistake+0.099
 lair+0.089
oral+0.085
lor+0.084
 privacy+0.082
Neg logit contributions
cy-0.059
aved-0.057
 fit-0.053
 lined-0.053
 swapped-0.053
Token without
Token ID1231
Feature activation+1.69e-01

Pos logit contributions
 mistake+0.001
 lair+0.001
oral+0.001
lor+0.001
 Majesty+0.001
Neg logit contributions
 fit-0.001
 went-0.001
 lined-0.001
 started-0.000
cy-0.000
Token permission
Token ID7170
Feature activation-6.74e-01

Pos logit contributions
 fit+0.004
 lined+0.004
 went+0.004
 started+0.004
 swapped+0.004
Neg logit contributions
lor-0.006
oral-0.006
 mistake-0.006
 lair-0.006
reetings-0.005
Token.
Token ID13
Feature activation-3.78e-01

Pos logit contributions
 lair+0.029
 mistake+0.029
oral+0.028
lor+0.028
 Majesty+0.027
Neg logit contributions
 fit-0.012
 lined-0.011
 went-0.011
 really-0.011
 started-0.011
Token She
Token ID1375
Feature activation-6.98e-02

Pos logit contributions
 mistake+0.017
 lair+0.016
lor+0.016
oral+0.015
 property+0.015
Neg logit contributions
 lined-0.006
 Derek-0.006
cy-0.005
 went-0.005
 really-0.005
Token,
Token ID11
Feature activation-8.83e-01

Pos logit contributions
 lined+0.007
 fit+0.007
cy+0.007
aved+0.007
 went+0.007
Neg logit contributions
 mistake-0.015
 lair-0.014
lor-0.014
oral-0.013
 property-0.013
Token there
Token ID612
Feature activation+1.02e+00

Pos logit contributions
 mistake+0.026
 lair+0.024
lor+0.023
oral+0.022
 property+0.021
Neg logit contributions
 lined-0.026
aved-0.026
 swapped-0.026
 fit-0.025
 Derek-0.025
Token was
Token ID373
Feature activation-2.36e-01

Pos logit contributions
aved+0.028
 swapped+0.028
 lined+0.027
pool+0.026
cy+0.026
Neg logit contributions
 mistake-0.033
 lair-0.029
lor-0.029
oral-0.028
 property-0.026
Token a
Token ID257
Feature activation-1.91e-01

Pos logit contributions
 mistake+0.007
 lair+0.007
 reading+0.006
 witches+0.006
oral+0.006
Neg logit contributions
aved-0.007
 lined-0.007
pool-0.007
cy-0.006
 Derek-0.006
Token big
Token ID1263
Feature activation-1.76e-01

Pos logit contributions
 mistake+0.006
 lair+0.005
oral+0.005
 witch+0.005
lor+0.005
Neg logit contributions
aved-0.006
 lined-0.006
cy-0.005
 swapped-0.005
pool-0.005
Token ostr
Token ID42387
Feature activation-2.58e+00

Pos logit contributions
 mistake+0.006
lor+0.006
 lair+0.006
oral+0.005
 property+0.005
Neg logit contributions
 fit-0.005
cy-0.005
 lined-0.005
 went-0.004
aved-0.004
Tokenich
Token ID488
Feature activation-2.28e+00

Pos logit contributions
 mistake+0.059
oral+0.050
 lair+0.049
 privacy+0.048
reetings+0.046
Neg logit contributions
cy-0.095
aved-0.093
bo-0.091
 lined-0.088
 fit-0.088
Token who
Token ID508
Feature activation-5.75e-01

Pos logit contributions
 mistake+0.095
 lair+0.090
oral+0.090
lor+0.089
reetings+0.086
Neg logit contributions
 went-0.045
 fit-0.043
cy-0.042
 really-0.041
 lined-0.041
Token was
Token ID373
Feature activation+3.28e-01

Pos logit contributions
 mistake+0.016
 lair+0.016
lor+0.015
 property+0.014
 privacy+0.014
Neg logit contributions
 Derek-0.017
inn-0.016
 swapped-0.016
 lined-0.016
 Drew-0.016
Token very
Token ID845
Feature activation-1.00e+00

Pos logit contributions
cy+0.004
aved+0.004
 lined+0.004
 swapped+0.004
 fit+0.004
Neg logit contributions
 mistake-0.015
 lair-0.015
lor-0.014
oral-0.014
 property-0.014
Token proud
Token ID6613
Feature activation+9.49e-01

Pos logit contributions
 mistake+0.035
 lair+0.033
oral+0.030
lor+0.030
 property+0.030
Neg logit contributions
aved-0.024
 lined-0.024
cy-0.024
 swapped-0.024
organ-0.023
Token She
Token ID1375
Feature activation-1.94e-01

Pos logit contributions
 lined+0.021
aved+0.020
 fit+0.020
cy+0.020
 swapped+0.020
Neg logit contributions
 mistake-0.030
 lair-0.029
lor-0.028
oral-0.027
 witches-0.026
Token sees
Token ID7224
Feature activation-9.21e-01

Pos logit contributions
lor+0.006
reetings+0.005
oral+0.005
 mistake+0.005
land+0.005
Neg logit contributions
 went-0.007
 started-0.006
 fit-0.006
cy-0.006
 swapped-0.006
Token him
Token ID683
Feature activation-5.70e-02

Pos logit contributions
 mistake+0.034
 lair+0.032
 witches+0.029
lor+0.029
oral+0.029
Neg logit contributions
 lined-0.022
aved-0.022
cy-0.021
 fit-0.021
 swapped-0.020
Token under
Token ID739
Feature activation-1.88e+00

Pos logit contributions
lor+0.002
 mistake+0.002
oral+0.002
 lair+0.002
 Majesty+0.002
Neg logit contributions
 fit-0.001
 went-0.001
 lined-0.001
cy-0.001
aved-0.001
Token the
Token ID262
Feature activation-6.48e-01

Pos logit contributions
lor+0.095
oral+0.095
reetings+0.094
 mistake+0.094
 lair+0.091
Neg logit contributions
 fit-0.018
 went-0.016
 lined-0.014
 started-0.013
 really-0.013
Token bed
Token ID3996
Feature activation-2.47e+00

Pos logit contributions
lor+0.018
 mistake+0.018
 lair+0.018
oral+0.018
reetings+0.018
Neg logit contributions
 fit-0.021
 went-0.019
 lined-0.019
 started-0.019
cy-0.018
Token.
Token ID13
Feature activation+9.35e-01

Pos logit contributions
 mistake+0.112
reetings+0.109
 Majesty+0.109
oral+0.108
 lair+0.108
Neg logit contributions
 fit-0.038
 went-0.034
 lined-0.032
 started-0.032
 really-0.029
Token She
Token ID1375
Feature activation-1.09e-01

Pos logit contributions
 lined+0.021
aved+0.021
 fit+0.021
cy+0.021
 swapped+0.020
Neg logit contributions
 mistake-0.035
 lair-0.033
lor-0.032
oral-0.032
 property-0.030
Token sees
Token ID7224
Feature activation-8.65e-01

Pos logit contributions
lor+0.004
reetings+0.004
oral+0.003
land+0.003
 property+0.003
Neg logit contributions
 went-0.003
 started-0.003
 fit-0.003
 looked-0.003
 swapped-0.003
Token the
Token ID262
Feature activation-7.36e-01

Pos logit contributions
 mistake+0.032
 lair+0.030
lor+0.029
oral+0.028
 witches+0.028
Neg logit contributions
 lined-0.020
aved-0.020
cy-0.020
 fit-0.019
 swapped-0.019
Token blanket
Token ID18447
Feature activation-8.69e-01

Pos logit contributions
lor+0.028
reetings+0.028
cod+0.025
 lair+0.025
oral+0.025
Neg logit contributions
 fit-0.019
 went-0.016
 started-0.016
 really-0.015
 swapped-0.015
Token was
Token ID373
Feature activation+6.41e-01

Pos logit contributions
 mistake+0.036
 lair+0.032
lor+0.031
oral+0.030
 property+0.030
Neg logit contributions
 lined-0.023
aved-0.022
 swapped-0.022
pool-0.021
inn-0.021
Token still
Token ID991
Feature activation-3.14e-01

Pos logit contributions
 lined+0.015
 fit+0.015
cy+0.015
aved+0.015
 swapped+0.014
Neg logit contributions
 mistake-0.024
 lair-0.022
lor-0.022
oral-0.021
 property-0.020
Token wearing
Token ID5762
Feature activation+6.13e-02

Pos logit contributions
 mistake+0.012
lor+0.012
 lair+0.012
oral+0.012
 property+0.011
Neg logit contributions
 fit-0.007
 lined-0.007
cy-0.006
 went-0.006
aved-0.006
Token her
Token ID607
Feature activation+5.22e-02

Pos logit contributions
 fit+0.001
 lined+0.001
aved+0.001
 swapped+0.001
 went+0.001
Neg logit contributions
 mistake-0.003
 lair-0.003
oral-0.003
lor-0.003
 property-0.002
Token p
Token ID279
Feature activation+2.52e-01

Pos logit contributions
 fit+0.001
 really+0.001
 matched+0.001
 went+0.001
 started+0.001
Neg logit contributions
 lair-0.002
oral-0.002
 mistake-0.002
lor-0.002
reetings-0.002
Tokenaj
Token ID1228
Feature activation-2.36e+00

Pos logit contributions
cy+0.009
aved+0.008
 lined+0.008
 fit+0.008
 swapped+0.008
Neg logit contributions
 mistake-0.007
 lair-0.006
oral-0.006
 witches-0.006
lor-0.006
Tokenamas
Token ID17485
Feature activation-3.71e-02

Pos logit contributions
 mistake+0.051
lor+0.049
 lair+0.047
oral+0.042
land+0.039
Neg logit contributions
cy-0.090
 lined-0.090
aved-0.090
 fit-0.088
 swapped-0.086
Token.
Token ID13
Feature activation-2.72e-01

Pos logit contributions
 mistake+0.001
 lair+0.001
lor+0.001
oral+0.001
 property+0.001
Neg logit contributions
 lined-0.001
 fit-0.001
cy-0.001
aved-0.001
 swapped-0.001
Token
Token ID220
Feature activation+6.36e-03

Pos logit contributions
 mistake+0.012
 lair+0.012
lor+0.011
oral+0.011
 property+0.011
Neg logit contributions
 went-0.005
 lined-0.004
 swapped-0.004
cy-0.004
 started-0.004
Token\n
Token ID198
Feature activation-5.27e-01

Pos logit contributions
 lined+0.000
 fit+0.000
cy+0.000
aved+0.000
 went+0.000
Neg logit contributions
 mistake-0.000
 lair-0.000
lor-0.000
oral-0.000
 property-0.000
Token\n
Token ID198
Feature activation-1.50e-01

Pos logit contributions
 mistake+0.019
 lair+0.019
lor+0.018
oral+0.017
 property+0.017
Neg logit contributions
 lined-0.012
 fit-0.012
cy-0.012
aved-0.012
 went-0.012
Token,
Token ID11
Feature activation-2.40e-01

Pos logit contributions
 mistake+0.024
lor+0.021
 lair+0.021
oral+0.020
 Majesty+0.020
Neg logit contributions
aved-0.018
 lined-0.018
 fit-0.017
 swapped-0.017
cy-0.017
Token mom
Token ID1995
Feature activation-1.11e+00

Pos logit contributions
 lair+0.011
 mistake+0.011
lor+0.010
oral+0.010
 property+0.010
Neg logit contributions
 fit-0.004
 went-0.004
 lined-0.003
cy-0.003
 started-0.003
Tokenmy
Token ID1820
Feature activation-4.59e-01

Pos logit contributions
 mistake+0.046
 lair+0.046
lor+0.044
oral+0.044
 privacy+0.041
Neg logit contributions
 lined-0.021
cy-0.021
 went-0.020
 fit-0.020
 started-0.019
Token!
Token ID0
Feature activation-9.72e-01

Pos logit contributions
 mistake+0.016
 lair+0.015
lor+0.015
oral+0.015
 property+0.013
Neg logit contributions
 lined-0.012
 fit-0.012
cy-0.012
 went-0.012
aved-0.011
Token The
Token ID383
Feature activation-7.11e-01

Pos logit contributions
 mistake+0.039
 lair+0.036
lor+0.033
oral+0.033
 property+0.032
Neg logit contributions
cy-0.023
 went-0.022
 lined-0.021
 fit-0.021
aved-0.021
Token ostr
Token ID42387
Feature activation-2.47e+00

Pos logit contributions
 lair+0.027
 mistake+0.026
lor+0.026
oral+0.024
joy+0.023
Neg logit contributions
 went-0.016
 fit-0.016
 lined-0.016
cy-0.015
 really-0.015
Tokenich
Token ID488
Feature activation-1.61e+00

Pos logit contributions
 mistake+0.064
 lair+0.055
oral+0.053
 conclusions+0.049
 heaven+0.048
Neg logit contributions
cy-0.085
bo-0.082
aved-0.082
 lined-0.079
 went-0.078
Token is
Token ID318
Feature activation+1.76e-01

Pos logit contributions
 mistake+0.067
oral+0.066
 lair+0.066
lor+0.064
joy+0.062
Neg logit contributions
 went-0.038
 started-0.037
 fit-0.036
 got-0.034
 lined-0.033
Token so
Token ID523
Feature activation-4.20e-01

Pos logit contributions
 fit+0.004
 lined+0.004
 really+0.003
 went+0.003
 started+0.003
Neg logit contributions
oral-0.007
 mistake-0.007
lor-0.007
 lair-0.007
 privacy-0.006
Token fat
Token ID3735
Feature activation+7.85e-01

Pos logit contributions
 mistake+0.018
 lair+0.017
lor+0.017
oral+0.016
 property+0.015
Neg logit contributions
 fit-0.008
 lined-0.008
cy-0.008
aved-0.008
 went-0.007
Token!"
Token ID2474
Feature activation-1.50e-01

Pos logit contributions
 lined+0.023
 fit+0.023
 went+0.021
 really+0.021
 started+0.020
Neg logit contributions
 mistake-0.027
oral-0.026
lor-0.025
 lair-0.025
 Majesty-0.023