TOP ACTIVATIONS
MAX = 0.030

. " RE VERS E , huh ? That
important job . She was tasked with completing a peaceful mission
ep inside and su pr ised himself when he saw that
insisting . Finally , his owner gave in and bought him
. Finally his owner got an idea . He
lucky . Fore shadow ing or setup and payoff
and low until they came upon the most unexpected place .
Alice , rain bows are way too special . You 'll
monsters or traps or th orns !" They held
. " Check mate !" declared Elsa . Her
said that it would be way more fun if everyone got
Max . When Max 's owner went to get him ,
in circles . Thankfully his owner saw him and ran over
. She played around for hours , and she made a
. She had not asked permission before taking it . She
. The shop owner hears the noise and comes
Finally , her owner walked up to Sprite and
Finally , Max 's owner came and opened the door
means that Mitt ens ' owner could always count on her
and bark ed until his owner gave him a treat to

MIN ACTIVATIONS
MIN = -0.073

When Daisy put the magn ifying glass down , the cater
flashlight , and a magn ifying glass . As
She takes out a magn ifying glass . She looks at
, handing him the magn ifying glass . She looked around
He took out a magn ifying glass and a notebook and
He brought his magn ifying glass and looked closely .
the watch with the magn ifying glass . He saw the
the necklace and the magn ifying glass and started to examine
as he opened the ch ocol ates and started eating them
it she had a magn ifying glass , a tiny torch
said , holding the magn ifying glass over a black ant
He pointed his magn ifying glass at different classmates and
, next to her magn ifying glass . She looked at
She took out a magn ifying glass and looked very carefully
a flashlight and a magn ifying glass to examine their finds
borrowed her mom 's magn ifying glass to take a closer
and handed him the magn ifying glass . He
Everyday he would take the av oc ados and scatter them
hat and grabbed her magn ifying glass . She was ready
cried , dropping the magn ifying glass . "

INTERVAL 0.004 - 0.030
CONTAINS 18.527%

coming . The storm was very big and scary . Mark
Once upon a time , there was a monkey named Max
They bite Tom 's hand . Tom drops the branch and
After a few minutes , Max 's paws th awed and
ered his parents until he finally got his own medal ,

INTERVAL -0.021 - 0.004
CONTAINS 81.439%

do it , John . You must keep your newfound strength
world and have some fun !" Anna said . She picked
mom my said , " Not now , little one .
a boy . He had two legs and loved to play
an idea . They decided to make the water even more

INTERVAL -0.047 - -0.021
CONTAINS 0.034%

mom found an old ash tr ay . She told Lily
and everything looked clean . Little Lucy walked to the park
stumbled across a big dish washer . It was so excited
was hungry . The re ind eer knew where to find
and looked out of the window . He saw some friendly

INTERVAL -0.073 - -0.047
CONTAINS 0.000%

She took out a magn ifying glass and looked very carefully
it she had a magn ifying glass , a tiny torch
hat and grabbed her magn ifying glass . She was ready
Everyday he would take the av oc ados and scatter them
He pointed his magn ifying glass at different classmates and
Token.
Token ID13
Feature activation+3.34e-04

Pos logit contributions
 other+0.004
L+0.003
 things+0.003
 cars+0.003
Tom+0.003
Neg logit contributions
 extingu-0.013
ccoli-0.012
lette-0.012
work-0.012
 pleasure-0.012
Token
Token ID198
Feature activation-8.34e-04

Pos logit contributions
 other+0.000
 things+0.000
Stop+0.000
 cars+0.000
 balls+0.000
Neg logit contributions
 extingu-0.001
ccoli-0.001
 pleasure-0.001
lette-0.001
 hyd-0.001
Token
Token ID198
Feature activation-3.12e-04

Pos logit contributions
 extingu+0.002
ccoli+0.002
ugg+0.002
lette+0.002
 pleasure+0.002
Neg logit contributions
 other-0.001
 things-0.001
Stop-0.001
 cars-0.001
 balls-0.001
Token"
Token ID1
Feature activation+3.81e-03

Pos logit contributions
 extingu+0.001
 hyd+0.001
ugg+0.001
ccoli+0.001
lette+0.001
Neg logit contributions
 other-0.000
 things-0.000
 cars-0.000
 balls-0.000
 bushes-0.000
TokenRE
Token ID2200
Feature activation+6.72e-03

Pos logit contributions
 other+0.005
 things+0.004
 cars+0.004
 balls+0.004
 me+0.004
Neg logit contributions
 extingu-0.010
 hyd-0.009
work-0.009
lette-0.009
ccoli-0.009
TokenVERS
Token ID28884
Feature activation+2.85e-02

Pos logit contributions
 other+0.015
L+0.014
 things+0.014
Tom+0.014
 cars+0.013
Neg logit contributions
 extingu-0.011
ccoli-0.010
work-0.010
lette-0.010
 pleasure-0.009
TokenE
Token ID36
Feature activation+7.83e-03

Pos logit contributions
 other+0.047
 things+0.043
 people+0.039
 she+0.039
 kids+0.039
Neg logit contributions
 extingu-0.058
work-0.055
lette-0.053
 hyd-0.052
flake-0.051
Token,
Token ID11
Feature activation+1.49e-03

Pos logit contributions
 other+0.011
L+0.010
 things+0.009
 cars+0.009
Tom+0.009
Neg logit contributions
 extingu-0.019
ccoli-0.019
lette-0.018
work-0.018
 pleasure-0.017
Token huh
Token ID24926
Feature activation+1.40e-03

Pos logit contributions
 other+0.002
Stop+0.001
L+0.001
Fine+0.001
 things+0.001
Neg logit contributions
 extingu-0.004
ccoli-0.004
lette-0.004
 hyd-0.004
 pleasure-0.004
Token?
Token ID30
Feature activation-2.22e-03

Pos logit contributions
 other+0.003
Stop+0.003
L+0.003
 bushes+0.002
 cars+0.002
Neg logit contributions
 extingu-0.002
ccoli-0.002
work-0.002
lette-0.002
urger-0.002
Token That
Token ID1320
Feature activation+4.92e-03

Pos logit contributions
 crop+0.006
lette+0.006
 extingu+0.006
 pleasure+0.006
ccoli+0.006
Neg logit contributions
 other-0.002
L-0.002
 Lily-0.002
 cars-0.002
Tom-0.002
Token important
Token ID1593
Feature activation+3.75e-03

Pos logit contributions
 other+0.010
 things+0.008
 cars+0.008
 people+0.008
 me+0.008
Neg logit contributions
ccoli-0.014
 extingu-0.014
work-0.013
flake-0.013
ugg-0.013
Token job
Token ID1693
Feature activation+5.87e-03

Pos logit contributions
 other+0.006
 things+0.006
 cars+0.005
 people+0.005
L+0.005
Neg logit contributions
 extingu-0.008
ccoli-0.008
work-0.008
lette-0.008
acket-0.007
Token.
Token ID13
Feature activation-2.11e-03

Pos logit contributions
 other+0.006
 things+0.005
 cars+0.005
 people+0.005
L+0.004
Neg logit contributions
 extingu-0.017
ccoli-0.016
lette-0.016
work-0.016
 pleasure-0.016
Token She
Token ID1375
Feature activation+4.44e-03

Pos logit contributions
 extingu+0.005
ccoli+0.005
lette+0.005
 pleasure+0.005
work+0.005
Neg logit contributions
 other-0.003
L-0.002
 things-0.002
Stop-0.002
Tom-0.002
Token was
Token ID373
Feature activation-2.79e-03

Pos logit contributions
 other+0.007
L+0.006
 things+0.006
 cars+0.006
 people+0.005
Neg logit contributions
 extingu-0.011
 pleasure-0.011
lette-0.011
ccoli-0.010
work-0.010
Token tasked
Token ID23052
Feature activation+2.78e-02

Pos logit contributions
 extingu+0.008
ccoli+0.008
icol+0.008
acket+0.008
lette+0.008
Neg logit contributions
 other-0.003
 things-0.002
 cars-0.002
 she-0.002
 people-0.002
Token with
Token ID351
Feature activation+3.02e-03

Pos logit contributions
 other+0.037
L+0.031
 things+0.031
Stop+0.030
 cars+0.030
Neg logit contributions
 extingu-0.067
lette-0.066
ccoli-0.065
work-0.065
 pleasure-0.062
Token completing
Token ID14339
Feature activation-1.17e-03

Pos logit contributions
 other+0.003
 things+0.002
L+0.002
 cars+0.002
 people+0.002
Neg logit contributions
 extingu-0.009
ccoli-0.009
lette-0.009
work-0.009
 pleasure-0.008
Token a
Token ID257
Feature activation+1.58e-04

Pos logit contributions
 extingu+0.003
ccoli+0.003
lette+0.003
work+0.003
 pleasure+0.003
Neg logit contributions
 other-0.001
L-0.001
Tom-0.001
Tim-0.001
Stop-0.001
Token peaceful
Token ID12309
Feature activation-2.93e-03

Pos logit contributions
 other+0.000
L+0.000
 things+0.000
Tom+0.000
 cars+0.000
Neg logit contributions
 extingu-0.000
ccoli-0.000
lette-0.000
work-0.000
 pleasure-0.000
Token mission
Token ID4365
Feature activation+2.70e-03

Pos logit contributions
 extingu+0.007
ccoli+0.007
lette+0.007
work+0.007
 pleasure+0.007
Neg logit contributions
 other-0.004
 things-0.004
 cars-0.003
 people-0.003
 me-0.003
Tokenep
Token ID538
Feature activation+7.35e-03

Pos logit contributions
 extingu+0.002
ccoli+0.002
work+0.002
lette+0.002
 pleasure+0.002
Neg logit contributions
 other-0.001
 things-0.001
L-0.001
 cars-0.001
 people-0.001
Token inside
Token ID2641
Feature activation-7.36e-03

Pos logit contributions
 other+0.008
L+0.008
Tom+0.007
Tim+0.007
Stop+0.007
Neg logit contributions
 extingu-0.019
lette-0.018
work-0.018
ccoli-0.018
 pleasure-0.017
Token and
Token ID290
Feature activation+6.96e-03

Pos logit contributions
 extingu+0.020
ccoli+0.020
lette+0.019
work+0.019
 pleasure+0.019
Neg logit contributions
 other-0.007
L-0.007
Tom-0.006
Stop-0.006
Tim-0.006
Token su
Token ID424
Feature activation+7.11e-05

Pos logit contributions
 other+0.011
 things+0.009
 people+0.009
 she+0.008
 cars+0.008
Neg logit contributions
 extingu-0.018
ccoli-0.017
lette-0.017
work-0.016
ugg-0.016
Tokenpr
Token ID1050
Feature activation-3.18e-04

Pos logit contributions
 other+0.000
L+0.000
 things+0.000
Tom+0.000
 cars+0.000
Neg logit contributions
 extingu-0.000
ccoli-0.000
lette-0.000
work-0.000
 pleasure-0.000
Tokenised
Token ID1417
Feature activation+2.78e-02

Pos logit contributions
 extingu+0.001
ccoli+0.001
lette+0.001
work+0.001
 pleasure+0.001
Neg logit contributions
 other-0.001
L-0.001
 things-0.000
 cars-0.000
Tom-0.000
Token himself
Token ID2241
Feature activation+2.14e-03

Pos logit contributions
 other+0.035
L+0.031
Tom+0.029
Stop+0.028
Tim+0.028
Neg logit contributions
 extingu-0.071
ccoli-0.069
work-0.068
lette-0.068
 pleasure-0.065
Token when
Token ID618
Feature activation+3.62e-03

Pos logit contributions
 other+0.002
 things+0.002
 she+0.002
 cars+0.002
 people+0.002
Neg logit contributions
 extingu-0.006
ccoli-0.006
lette-0.006
 pleasure-0.006
work-0.006
Token he
Token ID339
Feature activation+4.18e-03

Pos logit contributions
L+0.003
 other+0.003
Tom+0.002
Stop+0.002
Tim+0.002
Neg logit contributions
 extingu-0.010
work-0.010
 pleasure-0.010
ccoli-0.010
 crop-0.009
Token saw
Token ID2497
Feature activation-4.88e-03

Pos logit contributions
 other+0.006
L+0.005
 things+0.005
 cars+0.004
Tom+0.004
Neg logit contributions
 extingu-0.011
ccoli-0.010
lette-0.010
work-0.010
 pleasure-0.010
Token that
Token ID326
Feature activation-3.23e-03

Pos logit contributions
 extingu+0.013
ccoli+0.013
lette+0.013
work+0.013
 pleasure+0.013
Neg logit contributions
 other-0.005
L-0.004
 things-0.004
Tom-0.004
 cars-0.004
Token insisting
Token ID23630
Feature activation-2.37e-03

Pos logit contributions
 other+0.000
L+0.000
Tom+0.000
Stop+0.000
Tim+0.000
Neg logit contributions
 extingu-0.001
ccoli-0.001
lette-0.000
work-0.000
 hyd-0.000
Token.
Token ID13
Feature activation+1.14e-03

Pos logit contributions
lette+0.005
 extingu+0.005
ccoli+0.005
 hyd+0.005
work+0.005
Neg logit contributions
L-0.003
Stop-0.003
 Lia-0.003
 other-0.003
Tim-0.003
Token Finally
Token ID9461
Feature activation+3.63e-04

Pos logit contributions
 other+0.002
 things+0.002
Tom+0.001
L+0.001
Stop+0.001
Neg logit contributions
ccoli-0.003
 extingu-0.003
 pleasure-0.002
lette-0.002
 crop-0.002
Token,
Token ID11
Feature activation-4.87e-03

Pos logit contributions
 other+0.001
Stop+0.001
 cars+0.001
L+0.001
Tom+0.000
Neg logit contributions
 extingu-0.001
ccoli-0.001
 crop-0.001
 pleasure-0.001
icol-0.001
Token his
Token ID465
Feature activation+5.30e-03

Pos logit contributions
 extingu+0.012
ccoli+0.011
 crop+0.011
 hyd+0.011
lette+0.011
Neg logit contributions
 other-0.006
Stop-0.005
Tom-0.005
 cars-0.005
L-0.005
Token owner
Token ID4870
Feature activation+2.70e-02

Pos logit contributions
 other+0.007
L+0.006
 things+0.006
Stop+0.006
Tom+0.006
Neg logit contributions
 extingu-0.013
ccoli-0.012
lette-0.012
work-0.012
 pleasure-0.012
Token gave
Token ID2921
Feature activation+6.63e-04

Pos logit contributions
 other+0.044
L+0.039
 things+0.038
Tom+0.037
 cars+0.037
Neg logit contributions
 extingu-0.060
ccoli-0.059
lette-0.058
work-0.057
 pleasure-0.056
Token in
Token ID287
Feature activation-8.47e-03

Pos logit contributions
 other+0.001
Tom+0.001
L+0.001
 cars+0.001
Stop+0.001
Neg logit contributions
 extingu-0.002
ccoli-0.002
lette-0.002
 pleasure-0.002
work-0.002
Token and
Token ID290
Feature activation+6.92e-03

Pos logit contributions
 extingu+0.022
ccoli+0.021
lette+0.021
work+0.021
 pleasure+0.020
Neg logit contributions
 other-0.011
L-0.009
 things-0.009
 cars-0.008
Tom-0.008
Token bought
Token ID5839
Feature activation-1.46e-03

Pos logit contributions
 other+0.010
L+0.008
 things+0.008
 cars+0.008
Tom+0.008
Neg logit contributions
 extingu-0.017
ccoli-0.017
lette-0.017
work-0.016
 pleasure-0.016
Token him
Token ID683
Feature activation-6.16e-03

Pos logit contributions
 extingu+0.004
ccoli+0.004
lette+0.004
 pleasure+0.004
work+0.004
Neg logit contributions
 other-0.002
L-0.001
Tom-0.001
Stop-0.001
 things-0.001
Token.
Token ID13
Feature activation-2.26e-03

Pos logit contributions
 extingu+0.004
ccoli+0.004
lette+0.004
work+0.004
 pleasure+0.003
Neg logit contributions
 other-0.001
L-0.001
 things-0.001
Tom-0.001
 cars-0.001
Token
Token ID198
Feature activation-2.20e-03

Pos logit contributions
ccoli+0.005
 extingu+0.005
 pleasure+0.005
ugg+0.005
lette+0.005
Neg logit contributions
 other-0.003
 things-0.003
Stop-0.003
 people-0.003
 cars-0.003
Token
Token ID198
Feature activation+5.03e-04

Pos logit contributions
ccoli+0.005
 extingu+0.005
ugg+0.005
lette+0.005
 pleasure+0.005
Neg logit contributions
 other-0.003
 things-0.003
 cars-0.003
Stop-0.003
 people-0.003
TokenFinally
Token ID11158
Feature activation+8.76e-04

Pos logit contributions
 things+0.001
 other+0.001
 cars+0.001
 kids+0.001
 people+0.001
Neg logit contributions
ugg-0.001
ccoli-0.001
 extingu-0.001
lette-0.001
 hyd-0.001
Token his
Token ID465
Feature activation+3.87e-03

Pos logit contributions
 other+0.001
Stop+0.001
L+0.001
Tom+0.001
Tim+0.001
Neg logit contributions
 extingu-0.002
ccoli-0.002
 pleasure-0.002
work-0.002
lette-0.002
Token owner
Token ID4870
Feature activation+2.70e-02

Pos logit contributions
 other+0.005
L+0.005
 things+0.005
Stop+0.005
Tim+0.005
Neg logit contributions
 extingu-0.009
ccoli-0.009
lette-0.009
 pleasure-0.009
work-0.008
Token got
Token ID1392
Feature activation-1.54e-03

Pos logit contributions
 other+0.042
L+0.036
 things+0.035
Tom+0.034
 cars+0.034
Neg logit contributions
 extingu-0.063
ccoli-0.061
lette-0.060
work-0.059
 pleasure-0.058
Token an
Token ID281
Feature activation+6.92e-03

Pos logit contributions
 extingu+0.004
ccoli+0.004
lette+0.004
work+0.004
icol+0.004
Neg logit contributions
 other-0.002
 things-0.002
 people-0.001
 cars-0.001
 it-0.001
Token idea
Token ID2126
Feature activation+8.32e-04

Pos logit contributions
 other+0.012
 things+0.010
 people+0.010
 cars+0.010
 kids+0.010
Neg logit contributions
ccoli-0.016
 extingu-0.015
work-0.015
flake-0.014
lette-0.014
Token.
Token ID13
Feature activation+3.39e-04

Pos logit contributions
 other+0.001
L+0.001
 things+0.001
 cars+0.001
Tom+0.001
Neg logit contributions
 extingu-0.002
ccoli-0.002
lette-0.002
work-0.002
 pleasure-0.002
Token He
Token ID679
Feature activation+6.42e-04

Pos logit contributions
 other+0.000
 things+0.000
L+0.000
Tom+0.000
Stop+0.000
Neg logit contributions
 extingu-0.001
ccoli-0.001
lette-0.001
work-0.001
 pleasure-0.001
Token lucky
Token ID9670
Feature activation-2.63e-04

Pos logit contributions
 extingu+0.021
ccoli+0.020
acket+0.020
lette+0.019
icol+0.019
Neg logit contributions
 other-0.017
 things-0.015
 people-0.014
 you-0.013
 she-0.013
Token.
Token ID13
Feature activation-2.84e-03

Pos logit contributions
 extingu+0.001
ccoli+0.001
lette+0.001
work+0.001
urger+0.001
Neg logit contributions
 other-0.000
 things-0.000
 people-0.000
 cars-0.000
 kids-0.000
Token
Token ID198
Feature activation-4.27e-03

Pos logit contributions
 extingu+0.008
ccoli+0.008
lette+0.008
work+0.008
 crop+0.008
Neg logit contributions
 other-0.003
L-0.002
 things-0.002
Tom-0.002
 people-0.002
Token
Token ID198
Feature activation-2.47e-03

Pos logit contributions
 extingu+0.010
ccoli+0.010
lette+0.010
work+0.010
 pleasure+0.009
Neg logit contributions
 other-0.006
 things-0.005
L-0.005
 cars-0.005
Tom-0.005
TokenFore
Token ID16351
Feature activation+2.50e-03

Pos logit contributions
lette+0.006
 extingu+0.006
ccoli+0.006
work+0.006
 crop+0.006
Neg logit contributions
 other-0.003
 things-0.003
 cars-0.003
Stop-0.002
 people-0.002
Tokenshadow
Token ID19106
Feature activation+2.65e-02

Pos logit contributions
 other+0.004
L+0.004
 things+0.004
 cars+0.004
Tom+0.004
Neg logit contributions
 extingu-0.005
ccoli-0.005
lette-0.005
work-0.005
 pleasure-0.005
Tokening
Token ID278
Feature activation-1.59e-03

Pos logit contributions
 other+0.041
 things+0.035
L+0.035
 cars+0.034
Tom+0.034
Neg logit contributions
 extingu-0.060
ccoli-0.059
lette-0.058
work-0.058
 pleasure-0.054
Token or
Token ID393
Feature activation-1.79e-03

Pos logit contributions
 extingu+0.005
ccoli+0.004
lette+0.004
work+0.004
 pleasure+0.004
Neg logit contributions
 other-0.002
 things-0.001
L-0.001
 people-0.001
 cars-0.001
Token setup
Token ID9058
Feature activation+3.62e-03

Pos logit contributions
 extingu+0.004
ccoli+0.004
lette+0.004
ossom+0.004
icol+0.004
Neg logit contributions
 other-0.003
 things-0.003
 people-0.002
 cars-0.002
 me-0.002
Token and
Token ID290
Feature activation+1.21e-03

Pos logit contributions
 other+0.004
 things+0.003
L+0.003
 cars+0.003
Tom+0.003
Neg logit contributions
 extingu-0.010
ccoli-0.010
lette-0.010
work-0.009
 pleasure-0.009
Token payoff
Token ID43843
Feature activation+9.30e-03

Pos logit contributions
 other+0.003
 things+0.002
L+0.002
 people+0.002
Stop+0.002
Neg logit contributions
 extingu-0.002
ccoli-0.002
lette-0.002
ossom-0.002
work-0.002
Token and
Token ID290
Feature activation-1.95e-03

Pos logit contributions
 extingu+0.023
ccoli+0.023
lette+0.023
 pleasure+0.022
work+0.022
Neg logit contributions
 other-0.008
 things-0.007
 cars-0.006
 people-0.006
L-0.005
Token low
Token ID1877
Feature activation-2.75e-03

Pos logit contributions
ccoli+0.004
 extingu+0.004
work+0.004
lette+0.004
 pleasure+0.004
Neg logit contributions
 other-0.003
 things-0.002
L-0.002
Tom-0.002
 cars-0.002
Token until
Token ID1566
Feature activation+2.05e-03

Pos logit contributions
 extingu+0.008
lette+0.008
ccoli+0.008
work+0.008
 pleasure+0.007
Neg logit contributions
 other-0.003
 things-0.002
 cars-0.002
 people-0.002
 she-0.002
Token they
Token ID484
Feature activation-2.16e-04

Pos logit contributions
 other+0.002
L+0.001
 things+0.001
Tom+0.001
Stop+0.001
Neg logit contributions
 extingu-0.006
ccoli-0.006
work-0.006
lette-0.006
 pleasure-0.006
Token came
Token ID1625
Feature activation+4.85e-03

Pos logit contributions
 extingu+0.000
ccoli+0.000
lette+0.000
work+0.000
 pleasure+0.000
Neg logit contributions
 other-0.000
L-0.000
 things-0.000
 cars-0.000
Tom-0.000
Token upon
Token ID2402
Feature activation+2.65e-02

Pos logit contributions
 other+0.008
 things+0.007
L+0.006
 cars+0.006
 people+0.006
Neg logit contributions
 extingu-0.011
ccoli-0.011
lette-0.011
work-0.011
 pleasure-0.010
Token the
Token ID262
Feature activation+5.47e-03

Pos logit contributions
 other+0.036
 things+0.030
L+0.030
 cars+0.028
Tom+0.028
Neg logit contributions
 extingu-0.067
ccoli-0.066
lette-0.064
work-0.063
 pleasure-0.061
Token most
Token ID749
Feature activation+5.16e-03

Pos logit contributions
 other+0.007
 things+0.006
L+0.006
Tim+0.006
 cars+0.006
Neg logit contributions
 extingu-0.013
lette-0.013
work-0.013
ccoli-0.012
 pleasure-0.012
Token unexpected
Token ID10059
Feature activation-1.81e-03

Pos logit contributions
 other+0.008
 things+0.007
L+0.007
 cars+0.006
 people+0.006
Neg logit contributions
 extingu-0.012
ccoli-0.012
lette-0.012
work-0.012
 pleasure-0.011
Token place
Token ID1295
Feature activation-4.44e-03

Pos logit contributions
 extingu+0.004
work+0.004
ccoli+0.004
lette+0.004
 pleasure+0.004
Neg logit contributions
 other-0.002
L-0.002
 cars-0.002
Tom-0.002
Tim-0.002
Token.
Token ID13
Feature activation-2.23e-03

Pos logit contributions
 extingu+0.013
ccoli+0.012
lette+0.012
 pleasure+0.012
ossom+0.012
Neg logit contributions
 other-0.005
 things-0.004
 cars-0.004
 people-0.003
 she-0.003
Token Alice
Token ID14862
Feature activation+6.76e-04

Pos logit contributions
 other+0.000
L+0.000
 things+0.000
Stop+0.000
 cars+0.000
Neg logit contributions
 extingu-0.001
ccoli-0.001
 pleasure-0.001
lette-0.001
work-0.001
Token,
Token ID11
Feature activation-3.79e-03

Pos logit contributions
 other+0.001
 things+0.000
 she+0.000
 people+0.000
L+0.000
Neg logit contributions
 extingu-0.002
ccoli-0.002
lette-0.002
work-0.002
icol-0.002
Token rain
Token ID6290
Feature activation+4.78e-03

Pos logit contributions
 extingu+0.012
ccoli+0.011
lette+0.011
ossom+0.011
work+0.011
Neg logit contributions
 other-0.003
 things-0.003
 people-0.002
 she-0.002
 cars-0.002
Tokenbows
Token ID25435
Feature activation+2.55e-03

Pos logit contributions
 other+0.008
 things+0.007
 cars+0.007
L+0.007
 people+0.007
Neg logit contributions
 extingu-0.011
ccoli-0.010
lette-0.010
 pleasure-0.010
 crop-0.010
Token are
Token ID389
Feature activation-1.34e-03

Pos logit contributions
 other+0.003
 things+0.002
 people+0.002
 cars+0.002
 she+0.002
Neg logit contributions
 extingu-0.007
ccoli-0.007
lette-0.007
icol-0.007
work-0.007
Token way
Token ID835
Feature activation+2.61e-02

Pos logit contributions
 extingu+0.004
ccoli+0.004
lette+0.004
work+0.003
icol+0.003
Neg logit contributions
 other-0.002
 things-0.001
 people-0.001
 cars-0.001
L-0.001
Token too
Token ID1165
Feature activation-9.39e-03

Pos logit contributions
 other+0.037
 things+0.031
L+0.030
 cars+0.030
 people+0.029
Neg logit contributions
 extingu-0.065
ccoli-0.064
lette-0.062
work-0.060
 pleasure-0.059
Token special
Token ID2041
Feature activation+3.78e-03

Pos logit contributions
 extingu+0.025
ccoli+0.025
lette+0.024
work+0.023
 pleasure+0.023
Neg logit contributions
 other-0.012
 things-0.010
L-0.009
 people-0.009
 cars-0.009
Token.
Token ID13
Feature activation-1.22e-03

Pos logit contributions
 other+0.004
 things+0.003
 people+0.003
 cars+0.003
L+0.003
Neg logit contributions
 extingu-0.011
ccoli-0.010
lette-0.010
work-0.010
 pleasure-0.010
Token You
Token ID921
Feature activation+2.15e-03

Pos logit contributions
 extingu+0.003
ccoli+0.003
lette+0.003
work+0.003
 crop+0.003
Neg logit contributions
 other-0.002
L-0.001
 things-0.001
Tom-0.001
 cars-0.001
Token'll
Token ID1183
Feature activation+4.35e-03

Pos logit contributions
 other+0.003
 people+0.002
 things+0.002
 you+0.002
 it+0.002
Neg logit contributions
 extingu-0.006
icol-0.006
ccoli-0.006
urger-0.006
lette-0.006
Token monsters
Token ID10963
Feature activation+6.69e-03

Pos logit contributions
 extingu+0.002
ccoli+0.002
lette+0.002
work+0.002
 crop+0.002
Neg logit contributions
 other-0.001
L-0.001
Tom-0.001
Stop-0.001
Tim-0.001
Token or
Token ID393
Feature activation+1.39e-03

Pos logit contributions
 other+0.007
L+0.006
 things+0.006
Tom+0.006
 cars+0.006
Neg logit contributions
 extingu-0.018
ccoli-0.018
lette-0.018
work-0.017
 pleasure-0.017
Token traps
Token ID20348
Feature activation+2.12e-02

Pos logit contributions
 other+0.001
L+0.001
Tom+0.001
Tim+0.001
 things+0.001
Neg logit contributions
 extingu-0.004
ccoli-0.004
work-0.004
lette-0.004
 pleasure-0.003
Token or
Token ID393
Feature activation+3.45e-03

Pos logit contributions
 other+0.027
L+0.026
Tim+0.024
 things+0.023
Tom+0.023
Neg logit contributions
 extingu-0.051
ccoli-0.049
work-0.048
lette-0.047
 pleasure-0.046
Token th
Token ID294
Feature activation+1.30e-02

Pos logit contributions
 other+0.004
L+0.003
Tom+0.003
 things+0.003
Tim+0.003
Neg logit contributions
 extingu-0.009
ccoli-0.009
work-0.009
lette-0.009
 pleasure-0.009
Tokenorns
Token ID19942
Feature activation+2.60e-02

Pos logit contributions
 other+0.020
L+0.018
 things+0.017
Tom+0.016
 she+0.016
Neg logit contributions
 extingu-0.030
ccoli-0.029
lette-0.028
work-0.028
 crop-0.027
Token!"
Token ID2474
Feature activation-4.00e-03

Pos logit contributions
 other+0.035
L+0.033
Tom+0.030
Tim+0.030
Stop+0.030
Neg logit contributions
 extingu-0.061
ccoli-0.060
lette-0.058
work-0.058
 crop-0.055
Token
Token ID198
Feature activation-3.07e-04

Pos logit contributions
 extingu+0.007
 pleasure+0.007
ccoli+0.007
ugg+0.007
ossom+0.007
Neg logit contributions
 other-0.006
Stop-0.006
 things-0.005
 cars-0.005
Sp-0.005
Token
Token ID198
Feature activation-1.97e-04

Pos logit contributions
 extingu+0.001
ugg+0.001
ccoli+0.001
 pleasure+0.000
ossom+0.000
Neg logit contributions
 other-0.000
 things-0.000
Stop-0.000
 cheating-0.000
 balls-0.000
TokenThey
Token ID2990
Feature activation-5.74e-03

Pos logit contributions
lette+0.000
 extingu+0.000
ossom+0.000
 hyd+0.000
ccoli+0.000
Neg logit contributions
 spiders-0.000
 snakes-0.000
 other-0.000
 cars-0.000
 things-0.000
Token held
Token ID2714
Feature activation-5.12e-03

Pos logit contributions
 extingu+0.014
ccoli+0.014
lette+0.013
work+0.013
 pleasure+0.013
Neg logit contributions
 other-0.009
L-0.007
 things-0.007
Tom-0.007
 cars-0.007
Token.
Token ID13
Feature activation+3.52e-03

Pos logit contributions
 other+0.001
 things+0.001
 people+0.001
L+0.001
 cars+0.001
Neg logit contributions
ccoli-0.004
 extingu-0.004
lette-0.004
 pleasure-0.004
ugg-0.004
Token
Token ID198
Feature activation+1.82e-03

Pos logit contributions
 other+0.004
L+0.003
 things+0.003
Tom+0.003
 cars+0.003
Neg logit contributions
 extingu-0.009
ccoli-0.009
lette-0.009
work-0.009
 pleasure-0.009
Token
Token ID198
Feature activation+4.92e-03

Pos logit contributions
 other+0.002
 things+0.002
 cars+0.002
 balls+0.002
 cheating+0.002
Neg logit contributions
ccoli-0.004
lette-0.004
 extingu-0.004
ossom-0.004
ugg-0.004
Token"
Token ID1
Feature activation-5.48e-03

Pos logit contributions
 other+0.005
 things+0.005
 cars+0.005
 Lia+0.004
 people+0.004
Neg logit contributions
ccoli-0.012
lette-0.012
 extingu-0.012
ossom-0.012
ugg-0.012
TokenCheck
Token ID9787
Feature activation+5.18e-03

Pos logit contributions
 extingu+0.014
lette+0.014
ccoli+0.014
work+0.014
 pleasure+0.013
Neg logit contributions
 other-0.006
 things-0.006
 cars-0.005
 people-0.005
 Lia-0.005
Tokenmate
Token ID9830
Feature activation+2.59e-02

Pos logit contributions
 other+0.008
 things+0.007
L+0.007
 people+0.006
Tom+0.006
Neg logit contributions
 extingu-0.012
ccoli-0.012
lette-0.012
work-0.011
 pleasure-0.011
Token!"
Token ID2474
Feature activation+1.95e-03

Pos logit contributions
 other+0.041
L+0.036
 things+0.035
Tom+0.034
 cars+0.034
Neg logit contributions
 extingu-0.058
ccoli-0.057
lette-0.056
work-0.055
 pleasure-0.053
Token declared
Token ID6875
Feature activation-3.12e-03

Pos logit contributions
 other+0.003
 things+0.003
L+0.003
 cars+0.003
Tom+0.003
Neg logit contributions
 extingu-0.004
ccoli-0.004
lette-0.004
work-0.004
 pleasure-0.004
Token Elsa
Token ID19226
Feature activation-5.82e-03

Pos logit contributions
 extingu+0.009
ccoli+0.009
lette+0.008
work+0.008
 pleasure+0.008
Neg logit contributions
 other-0.003
 things-0.003
L-0.003
 people-0.003
 Lily-0.003
Token.
Token ID13
Feature activation+2.25e-03

Pos logit contributions
 extingu+0.018
ccoli+0.018
 pleasure+0.017
lette+0.017
icol+0.017
Neg logit contributions
 other-0.004
 things-0.003
 she-0.003
L-0.003
 people-0.003
Token Her
Token ID2332
Feature activation+4.95e-03

Pos logit contributions
 other+0.003
L+0.002
 things+0.002
 cars+0.002
Tom+0.002
Neg logit contributions
 extingu-0.006
ccoli-0.006
lette-0.006
work-0.006
 pleasure-0.006
Token said
Token ID531
Feature activation+2.37e-03

Pos logit contributions
 other+0.002
L+0.001
 cars+0.001
 things+0.001
Tom+0.001
Neg logit contributions
 extingu-0.003
lette-0.003
ccoli-0.003
 pleasure-0.003
work-0.003
Token that
Token ID326
Feature activation-4.83e-03

Pos logit contributions
 other+0.002
L+0.002
 things+0.002
 cars+0.002
Tom+0.002
Neg logit contributions
 extingu-0.007
ccoli-0.007
lette-0.007
work-0.007
 pleasure-0.006
Token it
Token ID340
Feature activation-1.25e-03

Pos logit contributions
 extingu+0.015
ccoli+0.014
lette+0.014
work+0.014
 pleasure+0.014
Neg logit contributions
 other-0.004
L-0.003
 things-0.003
 cars-0.003
Tom-0.003
Token would
Token ID561
Feature activation+4.06e-03

Pos logit contributions
 extingu+0.004
lette+0.004
ccoli+0.004
 pleasure+0.004
icol+0.004
Neg logit contributions
 other-0.001
 things-0.001
L-0.001
 cars-0.001
 people-0.001
Token be
Token ID307
Feature activation+3.57e-03

Pos logit contributions
 other+0.005
L+0.004
 things+0.004
 people+0.004
 cars+0.004
Neg logit contributions
 extingu-0.011
ccoli-0.011
lette-0.010
work-0.010
icol-0.010
Token way
Token ID835
Feature activation+2.55e-02

Pos logit contributions
 other+0.004
L+0.004
 things+0.003
 cars+0.003
Tom+0.003
Neg logit contributions
 extingu-0.010
ccoli-0.009
lette-0.009
work-0.009
 pleasure-0.009
Token more
Token ID517
Feature activation+1.06e-02

Pos logit contributions
 other+0.038
 things+0.032
L+0.031
 cars+0.031
 people+0.030
Neg logit contributions
 extingu-0.062
ccoli-0.061
lette-0.059
work-0.058
 pleasure-0.056
Token fun
Token ID1257
Feature activation-3.48e-03

Pos logit contributions
 other+0.011
 things+0.009
L+0.008
 people+0.008
Tom+0.007
Neg logit contributions
 extingu-0.032
ccoli-0.031
work-0.030
lette-0.030
ugg-0.029
Token if
Token ID611
Feature activation-1.32e-03

Pos logit contributions
 extingu+0.009
ccoli+0.009
lette+0.009
work+0.009
 pleasure+0.008
Neg logit contributions
 other-0.004
 things-0.004
L-0.004
 cars-0.004
 people-0.003
Token everyone
Token ID2506
Feature activation-2.42e-03

Pos logit contributions
 extingu+0.004
work+0.004
ccoli+0.004
lette+0.004
 pleasure+0.004
Neg logit contributions
 other-0.001
L-0.001
Tom-0.001
 cars-0.001
Stop-0.001
Token got
Token ID1392
Feature activation-1.26e-02

Pos logit contributions
 extingu+0.007
icol+0.007
ccoli+0.007
lette+0.007
urger+0.006
Neg logit contributions
 other-0.003
 things-0.002
 people-0.002
 she-0.002
 cars-0.002
Token Max
Token ID5436
Feature activation-2.30e-04

Pos logit contributions
 extingu+0.009
ccoli+0.009
lette+0.009
work+0.008
 pleasure+0.008
Neg logit contributions
 other-0.004
 things-0.003
L-0.003
 cars-0.003
 people-0.003
Token.
Token ID13
Feature activation+2.56e-03

Pos logit contributions
 extingu+0.001
ccoli+0.001
lette+0.001
work+0.001
 pleasure+0.001
Neg logit contributions
 other-0.000
 things-0.000
L-0.000
 cars-0.000
 people-0.000
Token When
Token ID1649
Feature activation-2.06e-03

Pos logit contributions
 other+0.003
 things+0.003
L+0.003
 people+0.003
Stop+0.003
Neg logit contributions
ccoli-0.006
 extingu-0.006
work-0.006
 pleasure-0.006
lette-0.006
Token Max
Token ID5436
Feature activation+5.49e-03

Pos logit contributions
ccoli+0.005
work+0.005
 extingu+0.005
 pleasure+0.004
 feast+0.004
Neg logit contributions
L-0.002
 other-0.002
Fine-0.002
Stop-0.002
Tom-0.002
Token's
Token ID338
Feature activation+9.63e-03

Pos logit contributions
 other+0.007
L+0.006
Tom+0.006
 things+0.006
 cars+0.005
Neg logit contributions
 extingu-0.014
ccoli-0.014
lette-0.014
work-0.014
 pleasure-0.013
Token owner
Token ID4870
Feature activation+2.55e-02

Pos logit contributions
 other+0.013
 things+0.011
 cars+0.011
L+0.010
 people+0.010
Neg logit contributions
 extingu-0.025
ccoli-0.024
work-0.024
lette-0.024
ossom-0.023
Token went
Token ID1816
Feature activation+2.40e-03

Pos logit contributions
 other+0.040
 things+0.035
L+0.035
 cars+0.033
Tom+0.033
Neg logit contributions
 extingu-0.057
ccoli-0.055
lette-0.054
work-0.054
 pleasure-0.052
Token to
Token ID284
Feature activation+1.23e-04

Pos logit contributions
 other+0.005
L+0.004
 things+0.004
Tom+0.004
Stop+0.004
Neg logit contributions
 extingu-0.005
work-0.004
ccoli-0.004
lette-0.004
 pleasure-0.004
Token get
Token ID651
Feature activation-4.99e-03

Pos logit contributions
 other+0.000
 things+0.000
L+0.000
 Lily+0.000
 cars+0.000
Neg logit contributions
ccoli-0.000
lette-0.000
 extingu-0.000
work-0.000
ossom-0.000
Token him
Token ID683
Feature activation-7.72e-03

Pos logit contributions
 extingu+0.014
ccoli+0.014
lette+0.014
work+0.013
 pleasure+0.013
Neg logit contributions
 other-0.005
 things-0.004
L-0.004
 cars-0.004
Tom-0.004
Token,
Token ID11
Feature activation-1.81e-03

Pos logit contributions
ccoli+0.023
lette+0.022
 extingu+0.022
icol+0.022
 crop+0.021
Neg logit contributions
 other-0.008
 things-0.007
 cars-0.006
 she-0.006
 Lily-0.006
Token in
Token ID287
Feature activation-2.93e-03

Pos logit contributions
 other+0.002
L+0.002
 things+0.001
Tom+0.001
Stop+0.001
Neg logit contributions
 extingu-0.004
ccoli-0.004
work-0.004
lette-0.004
 pleasure-0.004
Token circles
Token ID13332
Feature activation+1.29e-03

Pos logit contributions
 extingu+0.007
 pleasure+0.007
lette+0.007
 crop+0.007
work+0.007
Neg logit contributions
L-0.003
 other-0.003
Stop-0.003
Tom-0.003
Fine-0.003
Token.
Token ID13
Feature activation-5.88e-04

Pos logit contributions
 other+0.001
L+0.001
 things+0.001
Tom+0.001
Stop+0.001
Neg logit contributions
 extingu-0.004
ccoli-0.003
lette-0.003
work-0.003
 pleasure-0.003
Token Thankfully
Token ID31251
Feature activation+9.14e-03

Pos logit contributions
ccoli+0.001
 extingu+0.001
 pleasure+0.001
lette+0.001
ugg+0.001
Neg logit contributions
 other-0.001
 things-0.001
L-0.001
Stop-0.001
 people-0.001
Token his
Token ID465
Feature activation+3.92e-03

Pos logit contributions
 other+0.011
L+0.009
 things+0.009
Stop+0.008
Tom+0.008
Neg logit contributions
 extingu-0.023
ccoli-0.023
lette-0.022
work-0.022
 pleasure-0.022
Token owner
Token ID4870
Feature activation+2.53e-02

Pos logit contributions
 other+0.006
 things+0.005
 cars+0.005
 people+0.005
 kids+0.005
Neg logit contributions
 extingu-0.010
ccoli-0.009
lette-0.009
work-0.009
 hyd-0.009
Token saw
Token ID2497
Feature activation-1.14e-03

Pos logit contributions
 other+0.037
 things+0.033
L+0.033
 cars+0.031
Tom+0.031
Neg logit contributions
 extingu-0.058
ccoli-0.057
lette-0.056
work-0.055
 pleasure-0.053
Token him
Token ID683
Feature activation-3.25e-03

Pos logit contributions
 extingu+0.003
ccoli+0.003
lette+0.003
work+0.003
 pleasure+0.003
Neg logit contributions
 other-0.001
L-0.001
Tom-0.001
Tim-0.001
 things-0.001
Token and
Token ID290
Feature activation+8.20e-03

Pos logit contributions
 extingu+0.009
ccoli+0.009
lette+0.009
work+0.009
 pleasure+0.008
Neg logit contributions
 other-0.004
 things-0.003
 cars-0.003
 people-0.003
L-0.003
Token ran
Token ID4966
Feature activation+4.10e-03

Pos logit contributions
 other+0.012
L+0.011
 things+0.010
 cars+0.010
Tom+0.010
Neg logit contributions
 extingu-0.019
ccoli-0.019
lette-0.019
work-0.018
 pleasure-0.018
Token over
Token ID625
Feature activation-1.80e-03

Pos logit contributions
 other+0.008
L+0.007
 things+0.007
Tom+0.007
Stop+0.007
Neg logit contributions
 extingu-0.008
work-0.007
lette-0.007
ccoli-0.007
 pleasure-0.007
Token.
Token ID13
Feature activation+4.27e-04

Pos logit contributions
ccoli+0.003
 extingu+0.003
lette+0.003
 pleasure+0.003
work+0.003
Neg logit contributions
 other-0.001
 things-0.001
L-0.001
 cars-0.001
 people-0.001
Token She
Token ID1375
Feature activation+5.44e-03

Pos logit contributions
 other+0.001
L+0.000
 things+0.000
 cars+0.000
Tom+0.000
Neg logit contributions
 extingu-0.001
ccoli-0.001
lette-0.001
work-0.001
 pleasure-0.001
Token played
Token ID2826
Feature activation+4.90e-03

Pos logit contributions
 other+0.009
L+0.008
 cars+0.008
 she+0.007
 things+0.007
Neg logit contributions
 extingu-0.013
 pleasure-0.012
ccoli-0.012
lette-0.012
urger-0.012
Token around
Token ID1088
Feature activation+3.37e-03

Pos logit contributions
 other+0.006
L+0.005
 things+0.005
Tom+0.005
 cars+0.005
Neg logit contributions
 extingu-0.013
ccoli-0.013
lette-0.012
work-0.012
 pleasure-0.012
Token for
Token ID329
Feature activation+2.08e-03

Pos logit contributions
 other+0.003
L+0.003
Tom+0.003
Stop+0.002
Tim+0.002
Neg logit contributions
 extingu-0.009
ccoli-0.009
work-0.009
lette-0.009
 pleasure-0.009
Token hours
Token ID2250
Feature activation+2.53e-02

Pos logit contributions
 other+0.003
L+0.003
Stop+0.003
Tom+0.003
Tim+0.003
Neg logit contributions
 extingu-0.004
ccoli-0.004
 pleasure-0.004
lette-0.004
work-0.004
Token,
Token ID11
Feature activation+5.79e-04

Pos logit contributions
 other+0.029
L+0.024
 things+0.024
Stop+0.023
Tom+0.022
Neg logit contributions
 extingu-0.067
ccoli-0.065
lette-0.065
work-0.064
 pleasure-0.062
Token and
Token ID290
Feature activation+3.10e-03

Pos logit contributions
 other+0.001
 things+0.001
L+0.001
 cars+0.001
Tom+0.001
Neg logit contributions
 extingu-0.001
ccoli-0.001
lette-0.001
work-0.001
 pleasure-0.001
Token she
Token ID673
Feature activation+9.76e-03

Pos logit contributions
 other+0.004
L+0.004
Stop+0.003
 things+0.003
Tom+0.003
Neg logit contributions
ccoli-0.007
 extingu-0.007
lette-0.007
work-0.007
 pleasure-0.007
Token made
Token ID925
Feature activation-7.24e-03

Pos logit contributions
 other+0.012
L+0.010
 things+0.009
 cars+0.009
 people+0.009
Neg logit contributions
 extingu-0.027
ccoli-0.026
lette-0.025
 pleasure-0.025
work-0.025
Token a
Token ID257
Feature activation-2.71e-03

Pos logit contributions
 extingu+0.017
ccoli+0.017
lette+0.017
work+0.016
 pleasure+0.016
Neg logit contributions
 other-0.010
L-0.009
 things-0.008
Tom-0.008
Stop-0.008
Token.
Token ID13
Feature activation-5.59e-03

Pos logit contributions
 extingu+0.008
work+0.007
ccoli+0.007
lette+0.007
 hyd+0.007
Neg logit contributions
 other-0.005
Tom-0.004
Tim-0.004
Stop-0.004
L-0.004
Token She
Token ID1375
Feature activation-1.90e-03

Pos logit contributions
 extingu+0.016
ccoli+0.016
lette+0.015
work+0.015
 pleasure+0.015
Neg logit contributions
 other-0.006
L-0.005
 things-0.005
 cars-0.004
Tom-0.004
Token had
Token ID550
Feature activation-5.23e-03

Pos logit contributions
 extingu+0.005
ccoli+0.005
lette+0.004
work+0.004
 pleasure+0.004
Neg logit contributions
 other-0.003
L-0.002
 things-0.002
 cars-0.002
Tom-0.002
Token not
Token ID407
Feature activation-8.67e-03

Pos logit contributions
 extingu+0.013
ccoli+0.013
lette+0.012
work+0.012
 pleasure+0.012
Neg logit contributions
 other-0.008
L-0.007
 things-0.007
 cars-0.006
 she-0.006
Token asked
Token ID1965
Feature activation-3.69e-05

Pos logit contributions
 extingu+0.021
ccoli+0.020
icol+0.019
work+0.019
acket+0.018
Neg logit contributions
 other-0.015
 things-0.013
 people-0.013
L-0.012
 she-0.012
Token permission
Token ID7170
Feature activation+2.53e-02

Pos logit contributions
 extingu+0.000
ccoli+0.000
lette+0.000
work+0.000
 pleasure+0.000
Neg logit contributions
 other-0.000
L-0.000
Tom-0.000
Stop-0.000
Tim-0.000
Token before
Token ID878
Feature activation+3.36e-03

Pos logit contributions
 other+0.036
L+0.036
Stop+0.035
Tim+0.034
Tom+0.034
Neg logit contributions
ccoli-0.054
lette-0.054
work-0.054
 extingu-0.052
urger-0.049
Token taking
Token ID2263
Feature activation-1.22e-02

Pos logit contributions
 other+0.004
L+0.004
Stop+0.004
Tim+0.004
Tom+0.004
Neg logit contributions
 extingu-0.008
ccoli-0.008
lette-0.008
work-0.007
 crop-0.007
Token it
Token ID340
Feature activation-2.59e-03

Pos logit contributions
 extingu+0.033
ccoli+0.032
lette+0.032
work+0.031
 pleasure+0.031
Neg logit contributions
 other-0.014
L-0.012
Tom-0.011
 things-0.011
Tim-0.011
Token.
Token ID13
Feature activation-3.62e-03

Pos logit contributions
 extingu+0.007
ccoli+0.007
lette+0.007
work+0.007
 pleasure+0.006
Neg logit contributions
 other-0.003
L-0.003
 things-0.003
Tom-0.002
 cars-0.002
Token She
Token ID1375
Feature activation+5.68e-04

Pos logit contributions
 extingu+0.009
ccoli+0.009
lette+0.009
work+0.009
 pleasure+0.008
Neg logit contributions
 other-0.005
L-0.004
 things-0.004
 cars-0.004
Tom-0.004
Token.
Token ID13
Feature activation+1.62e-03

Pos logit contributions
 other+0.015
L+0.013
 things+0.012
Stop+0.012
Tom+0.012
Neg logit contributions
 extingu-0.037
ccoli-0.036
work-0.035
lette-0.035
 pleasure-0.033
Token\n
Token ID198
Feature activation-4.11e-04

Pos logit contributions
 other+0.002
 things+0.002
L+0.002
 cars+0.002
 people+0.002
Neg logit contributions
 extingu-0.004
ccoli-0.004
work-0.004
lette-0.004
 pleasure-0.004
Token\n
Token ID198
Feature activation-7.21e-04

Pos logit contributions
 extingu+0.001
ccoli+0.001
lette+0.001
work+0.001
 pleasure+0.001
Neg logit contributions
 other-0.001
 things-0.000
 cars-0.000
Stop-0.000
 balls-0.000
TokenThe
Token ID464
Feature activation-6.75e-04

Pos logit contributions
 extingu+0.002
 hyd+0.002
lette+0.002
ccoli+0.002
work+0.002
Neg logit contributions
 other-0.001
 things-0.001
 cars-0.001
 she-0.001
 balls-0.001
Token shop
Token ID6128
Feature activation+7.64e-03

Pos logit contributions
 extingu+0.001
ccoli+0.001
work+0.001
 hyd+0.001
 pleasure+0.001
Neg logit contributions
Stop-0.001
 Lia-0.001
Fine-0.001
 them-0.001
 other-0.001
Token owner
Token ID4870
Feature activation+2.52e-02

Pos logit contributions
 other+0.012
L+0.011
Tim+0.010
Stop+0.010
Tom+0.010
Neg logit contributions
 extingu-0.017
work-0.016
ccoli-0.016
lette-0.016
 hyd-0.015
Token hears
Token ID23180
Feature activation+5.88e-03

Pos logit contributions
 other+0.036
L+0.032
Tim+0.031
 things+0.030
Tom+0.030
Neg logit contributions
 extingu-0.058
ccoli-0.054
work-0.054
lette-0.053
 hyd-0.049
Token the
Token ID262
Feature activation+1.12e-03

Pos logit contributions
 other+0.006
Fine+0.006
L+0.006
Sp+0.005
 cars+0.005
Neg logit contributions
 extingu-0.016
work-0.013
ccoli-0.013
lette-0.013
 pleasure-0.013
Token noise
Token ID7838
Feature activation-6.26e-03

Pos logit contributions
 other+0.001
L+0.001
 things+0.001
Stop+0.001
Tom+0.001
Neg logit contributions
 extingu-0.003
work-0.003
ccoli-0.003
lette-0.003
 pleasure-0.003
Token and
Token ID290
Feature activation+1.40e-02

Pos logit contributions
 extingu+0.013
work+0.013
lette+0.012
ccoli+0.012
 hyd+0.012
Neg logit contributions
 other-0.010
L-0.009
Stop-0.009
Tim-0.008
 things-0.008
Token comes
Token ID2058
Feature activation+1.61e-02

Pos logit contributions
 other+0.022
L+0.020
Tom+0.019
Tim+0.019
 things+0.018
Neg logit contributions
 extingu-0.031
ccoli-0.028
work-0.027
icol-0.026
lette-0.026
Token\n
Token ID198
Feature activation+1.34e-04

Pos logit contributions
ccoli+0.001
 extingu+0.001
 pleasure+0.001
lette+0.001
ugg+0.001
Neg logit contributions
 other-0.001
 things-0.000
L-0.000
Tim-0.000
 cars-0.000
Token\n
Token ID198
Feature activation+2.79e-03

Pos logit contributions
 things+0.000
 other+0.000
 cars+0.000
 balls+0.000
 people+0.000
Neg logit contributions
ccoli-0.000
ugg-0.000
 extingu-0.000
ossom-0.000
 pleasure-0.000
TokenFinally
Token ID11158
Feature activation+1.09e-03

Pos logit contributions
 cars+0.004
 other+0.004
 things+0.004
 bridges+0.003
 snakes+0.003
Neg logit contributions
ugg-0.005
ossom-0.005
lette-0.005
ccoli-0.005
moon-0.005
Token,
Token ID11
Feature activation-4.85e-03

Pos logit contributions
 other+0.002
Stop+0.001
Tim+0.001
 cars+0.001
Tom+0.001
Neg logit contributions
 extingu-0.002
ccoli-0.002
 pleasure-0.002
 crop-0.002
lette-0.002
Token her
Token ID607
Feature activation+5.08e-03

Pos logit contributions
 extingu+0.011
ccoli+0.010
 crop+0.010
work+0.010
lette+0.010
Neg logit contributions
 other-0.007
 cars-0.006
Tim-0.006
Tom-0.006
L-0.006
Token owner
Token ID4870
Feature activation+2.51e-02

Pos logit contributions
 other+0.007
L+0.006
 things+0.006
 cars+0.006
Tom+0.006
Neg logit contributions
 extingu-0.012
ccoli-0.012
lette-0.012
work-0.012
 pleasure-0.012
Token walked
Token ID6807
Feature activation+2.90e-03

Pos logit contributions
 other+0.036
L+0.031
 things+0.029
Tom+0.029
 people+0.028
Neg logit contributions
 extingu-0.062
ccoli-0.061
lette-0.060
work-0.059
 pleasure-0.058
Token up
Token ID510
Feature activation+5.35e-03

Pos logit contributions
 other+0.005
L+0.005
Tom+0.004
 things+0.004
 cars+0.004
Neg logit contributions
 extingu-0.006
work-0.005
lette-0.005
ccoli-0.005
 hyd-0.005
Token to
Token ID284
Feature activation-7.99e-04

Pos logit contributions
 other+0.008
L+0.007
 things+0.007
 cars+0.006
Tom+0.006
Neg logit contributions
 extingu-0.013
ccoli-0.013
lette-0.012
work-0.012
 pleasure-0.012
Token Sprite
Token ID33132
Feature activation+4.66e-03

Pos logit contributions
 extingu+0.002
ccoli+0.002
 pleasure+0.002
lette+0.002
work+0.002
Neg logit contributions
 other-0.001
 cars-0.001
 things-0.001
L-0.001
Tim-0.001
Token and
Token ID290
Feature activation+5.64e-03

Pos logit contributions
 other+0.005
 things+0.003
L+0.003
 cars+0.003
 people+0.003
Neg logit contributions
 extingu-0.014
ccoli-0.013
lette-0.013
work-0.013
 pleasure-0.013
Token\n
Token ID198
Feature activation+2.67e-03

Pos logit contributions
 things+0.001
 other+0.001
Stop+0.001
 cars+0.001
 balls+0.001
Neg logit contributions
ccoli-0.001
 extingu-0.001
 pleasure-0.001
lette-0.001
ugg-0.001
TokenFinally
Token ID11158
Feature activation+3.69e-04

Pos logit contributions
 other+0.004
 things+0.004
 cars+0.003
 kids+0.003
 Lia+0.003
Neg logit contributions
ccoli-0.006
lette-0.005
 extingu-0.005
ossom-0.005
ugg-0.005
Token,
Token ID11
Feature activation-4.33e-03

Pos logit contributions
 other+0.001
Stop+0.001
L+0.000
Tom+0.000
Tim+0.000
Neg logit contributions
 extingu-0.001
ccoli-0.001
work-0.001
 pleasure-0.001
 crop-0.001
Token Max
Token ID5436
Feature activation+3.45e-03

Pos logit contributions
ccoli+0.011
 extingu+0.011
work+0.010
 pleasure+0.010
 hyd+0.010
Neg logit contributions
 other-0.005
L-0.004
Stop-0.004
Fine-0.004
Tom-0.004
Token's
Token ID338
Feature activation+8.18e-03

Pos logit contributions
 other+0.005
L+0.004
Tom+0.004
 things+0.004
 people+0.004
Neg logit contributions
 extingu-0.009
ccoli-0.009
work-0.009
 pleasure-0.008
lette-0.008
Token owner
Token ID4870
Feature activation+2.50e-02

Pos logit contributions
 other+0.011
 things+0.009
L+0.008
 cars+0.008
 people+0.008
Neg logit contributions
 extingu-0.021
ccoli-0.021
work-0.021
lette-0.021
 pleasure-0.020
Token came
Token ID1625
Feature activation+7.24e-03

Pos logit contributions
 other+0.037
L+0.032
 things+0.032
Tom+0.030
 cars+0.030
Neg logit contributions
 extingu-0.059
ccoli-0.058
lette-0.057
work-0.056
 pleasure-0.055
Token and
Token ID290
Feature activation+1.09e-02

Pos logit contributions
 other+0.011
 things+0.009
L+0.009
 cars+0.009
 people+0.009
Neg logit contributions
ccoli-0.017
 extingu-0.017
lette-0.016
work-0.016
 pleasure-0.016
Token opened
Token ID4721
Feature activation+4.61e-03

Pos logit contributions
 other+0.016
L+0.015
 she+0.014
 things+0.013
 Lily+0.013
Neg logit contributions
lette-0.026
 extingu-0.026
 pleasure-0.026
ccoli-0.026
work-0.026
Token the
Token ID262
Feature activation+5.10e-03

Pos logit contributions
 other+0.003
 things+0.002
L+0.002
 cars+0.002
 people+0.002
Neg logit contributions
 extingu-0.015
ccoli-0.015
lette-0.014
work-0.014
 pleasure-0.014
Token door
Token ID3420
Feature activation-7.95e-03

Pos logit contributions
 other+0.006
 things+0.005
L+0.005
 cars+0.005
 people+0.005
Neg logit contributions
 extingu-0.013
ccoli-0.013
lette-0.013
work-0.013
 pleasure-0.012
Token means
Token ID1724
Feature activation+3.65e-03

Pos logit contributions
 other+0.004
 things+0.004
L+0.003
 people+0.003
 cars+0.003
Neg logit contributions
 extingu-0.011
ccoli-0.010
lette-0.010
work-0.010
ossom-0.010
Token that
Token ID326
Feature activation-1.58e-03

Pos logit contributions
 other+0.004
L+0.004
Tom+0.004
Stop+0.003
Tim+0.003
Neg logit contributions
 extingu-0.009
ccoli-0.009
lette-0.009
work-0.009
 pleasure-0.009
Token Mitt
Token ID16627
Feature activation-3.74e-03

Pos logit contributions
ccoli+0.005
 extingu+0.005
lette+0.004
work+0.004
 pleasure+0.004
Neg logit contributions
 other-0.001
L-0.001
Tom-0.001
Tim-0.001
Stop-0.001
Tokenens
Token ID641
Feature activation+8.46e-03

Pos logit contributions
ccoli+0.009
 extingu+0.008
lette+0.008
work+0.008
ugg+0.007
Neg logit contributions
 other-0.006
 cars-0.005
 things-0.005
Tom-0.005
L-0.005
Token'
Token ID6
Feature activation+1.18e-03

Pos logit contributions
 other+0.010
 things+0.008
 cars+0.008
 people+0.008
L+0.008
Neg logit contributions
 extingu-0.024
lette-0.022
ccoli-0.022
work-0.022
 pleasure-0.022
Token owner
Token ID4870
Feature activation+2.48e-02

Pos logit contributions
 other+0.002
Tom+0.002
L+0.002
Stop+0.001
Tim+0.001
Neg logit contributions
 extingu-0.003
lette-0.003
ccoli-0.003
 pleasure-0.002
work-0.002
Token could
Token ID714
Feature activation-2.90e-04

Pos logit contributions
 other+0.037
 things+0.030
L+0.030
 cars+0.029
 people+0.029
Neg logit contributions
 extingu-0.063
lette-0.059
ccoli-0.059
work-0.057
icol-0.057
Token always
Token ID1464
Feature activation+2.61e-03

Pos logit contributions
 extingu+0.001
ccoli+0.001
lette+0.001
work+0.001
 pleasure+0.001
Neg logit contributions
 other-0.000
L-0.000
 things-0.000
 cars-0.000
Tom-0.000
Token count
Token ID954
Feature activation+9.49e-04

Pos logit contributions
 other+0.004
 things+0.003
 me+0.003
 people+0.003
L+0.003
Neg logit contributions
 extingu-0.006
ccoli-0.006
work-0.006
lette-0.006
icol-0.006
Token on
Token ID319
Feature activation-3.08e-03

Pos logit contributions
 other+0.001
L+0.001
Tom+0.001
Stop+0.001
Tim+0.001
Neg logit contributions
 extingu-0.002
ccoli-0.002
work-0.002
lette-0.002
 pleasure-0.002
Token her
Token ID607
Feature activation-1.30e-03

Pos logit contributions
 extingu+0.008
ccoli+0.008
lette+0.008
work+0.008
 pleasure+0.008
Neg logit contributions
 other-0.003
L-0.003
Tom-0.002
Stop-0.002
Tim-0.002
Token and
Token ID290
Feature activation+4.61e-03

Pos logit contributions
 extingu+0.010
ccoli+0.010
work+0.009
lette+0.009
 pleasure+0.009
Neg logit contributions
 other-0.005
L-0.004
Tom-0.004
 things-0.004
Stop-0.004
Token bark
Token ID21405
Feature activation+7.00e-03

Pos logit contributions
 other+0.007
L+0.006
 things+0.006
Tom+0.006
 cars+0.006
Neg logit contributions
ccoli-0.011
 extingu-0.010
work-0.010
lette-0.010
 pleasure-0.010
Tokened
Token ID276
Feature activation-1.69e-03

Pos logit contributions
 other+0.015
L+0.014
 things+0.014
 cars+0.013
Tom+0.013
Neg logit contributions
 extingu-0.012
ccoli-0.011
lette-0.011
work-0.011
 pleasure-0.010
Token until
Token ID1566
Feature activation+2.52e-03

Pos logit contributions
 extingu+0.004
ccoli+0.004
lette+0.004
work+0.004
 pleasure+0.004
Neg logit contributions
 other-0.002
L-0.002
Tom-0.002
 things-0.002
 cars-0.002
Token his
Token ID465
Feature activation-8.77e-06

Pos logit contributions
 other+0.002
L+0.002
Stop+0.002
 things+0.002
Tom+0.002
Neg logit contributions
ccoli-0.007
 extingu-0.007
work-0.007
lette-0.007
 pleasure-0.007
Token owner
Token ID4870
Feature activation+2.48e-02

Feature not active on previous token; no predictions were affected.
Token gave
Token ID2921
Feature activation+9.13e-04

Pos logit contributions
 other+0.040
L+0.035
 things+0.035
 cars+0.033
Tom+0.033
Neg logit contributions
 extingu-0.055
ccoli-0.054
lette-0.053
work-0.052
 pleasure-0.050
Token him
Token ID683
Feature activation-4.85e-03

Pos logit contributions
 other+0.001
L+0.001
Tom+0.001
 Lia+0.001
Tim+0.001
Neg logit contributions
ccoli-0.002
 extingu-0.002
 pleasure-0.002
work-0.002
lette-0.002
Token a
Token ID257
Feature activation+7.32e-03

Pos logit contributions
 extingu+0.014
ccoli+0.014
lette+0.014
work+0.014
icol+0.013
Neg logit contributions
 other-0.005
 things-0.004
L-0.004
 cars-0.004
 people-0.004
Token treat
Token ID2190
Feature activation+1.44e-02

Pos logit contributions
 other+0.010
 things+0.007
L+0.007
 Lily+0.007
 lot+0.007
Neg logit contributions
lette-0.020
work-0.020
 extingu-0.019
ccoli-0.019
ci-0.019
Token to
Token ID284
Feature activation+5.80e-03

Pos logit contributions
 other+0.013
 things+0.009
L+0.009
 cars+0.009
 she+0.008
Neg logit contributions
 extingu-0.043
ccoli-0.042
lette-0.042
 pleasure-0.041
ossom-0.041
Token When
Token ID1649
Feature activation-7.33e-03

Pos logit contributions
 other+0.002
 things+0.002
L+0.002
 cars+0.002
Tom+0.002
Neg logit contributions
 extingu-0.005
ccoli-0.005
lette-0.005
work-0.005
 pleasure-0.005
Token Daisy
Token ID40355
Feature activation-2.15e-03

Pos logit contributions
 extingu+0.024
ccoli+0.023
lette+0.023
work+0.023
 pleasure+0.022
Neg logit contributions
 other-0.005
 things-0.003
L-0.003
 cars-0.003
 people-0.003
Token put
Token ID1234
Feature activation-6.91e-03

Pos logit contributions
 extingu+0.006
lette+0.006
 pleasure+0.006
ccoli+0.006
work+0.006
Neg logit contributions
 other-0.003
 she-0.002
 me-0.002
L-0.002
 Lily-0.002
Token the
Token ID262
Feature activation+6.06e-04

Pos logit contributions
 extingu+0.019
ccoli+0.019
lette+0.019
work+0.018
 pleasure+0.018
Neg logit contributions
 other-0.007
L-0.006
Tom-0.005
 things-0.005
Tim-0.005
Token magn
Token ID7842
Feature activation-1.51e-02

Pos logit contributions
 other+0.001
 things+0.001
 cars+0.001
 people+0.001
 kids+0.001
Neg logit contributions
ccoli-0.001
 extingu-0.001
lette-0.001
work-0.001
ossom-0.001
Tokenifying
Token ID4035
Feature activation-7.27e-02

Pos logit contributions
ccoli+0.022
 extingu+0.022
lette+0.020
 pleasure+0.020
 crop+0.020
Neg logit contributions
 other-0.035
L-0.032
 things-0.032
 cars-0.032
 she-0.031
Token glass
Token ID5405
Feature activation-8.59e-03

Pos logit contributions
ccoli+0.163
lette+0.157
 extingu+0.156
urger+0.153
 pleasure+0.144
Neg logit contributions
 other-0.125
L-0.107
 me-0.107
 Lily-0.105
 things-0.105
Token down
Token ID866
Feature activation+3.64e-03

Pos logit contributions
 extingu+0.024
ccoli+0.024
lette+0.023
 pleasure+0.023
work+0.023
Neg logit contributions
 other-0.009
 things-0.008
L-0.007
 cars-0.007
 she-0.007
Token,
Token ID11
Feature activation+5.67e-03

Pos logit contributions
 other+0.004
L+0.003
Tom+0.003
 things+0.003
 cars+0.003
Neg logit contributions
 extingu-0.010
ccoli-0.010
work-0.009
lette-0.009
 pleasure-0.009
Token the
Token ID262
Feature activation-6.52e-04

Pos logit contributions
 other+0.003
L+0.002
Tom+0.002
 cars+0.002
 things+0.002
Neg logit contributions
 extingu-0.018
ccoli-0.018
lette-0.017
work-0.017
 pleasure-0.017
Token cater
Token ID20825
Feature activation-1.18e-02

Pos logit contributions
 extingu+0.002
ccoli+0.002
lette+0.001
work+0.001
 pleasure+0.001
Neg logit contributions
 other-0.001
 things-0.001
L-0.001
 cars-0.001
 people-0.001
Token flashlight
Token ID38371
Feature activation-7.64e-03

Pos logit contributions
 extingu+0.005
ccoli+0.005
lette+0.005
work+0.005
ugg+0.005
Neg logit contributions
 other-0.004
 things-0.003
L-0.003
 cars-0.003
 people-0.003
Token,
Token ID11
Feature activation-9.50e-05

Pos logit contributions
ccoli+0.022
 extingu+0.022
lette+0.021
 pleasure+0.021
ossom+0.021
Neg logit contributions
 other-0.008
 things-0.006
 cars-0.006
 she-0.006
 people-0.006
Token and
Token ID290
Feature activation+6.92e-04

Pos logit contributions
 extingu+0.000
ccoli+0.000
work+0.000
lette+0.000
 pleasure+0.000
Neg logit contributions
 other-0.000
 things-0.000
 cars-0.000
L-0.000
 people-0.000
Token a
Token ID257
Feature activation-6.18e-03

Pos logit contributions
 other+0.001
 things+0.001
L+0.001
 cars+0.001
 people+0.001
Neg logit contributions
 extingu-0.002
ccoli-0.002
lette-0.002
work-0.002
 pleasure-0.002
Token magn
Token ID7842
Feature activation-8.91e-03

Pos logit contributions
 extingu+0.014
ccoli+0.014
lette+0.013
work+0.013
ugg+0.013
Neg logit contributions
 other-0.010
 things-0.009
 cars-0.009
L-0.009
 people-0.008
Tokenifying
Token ID4035
Feature activation-6.30e-02

Pos logit contributions
ccoli+0.013
 extingu+0.013
lette+0.012
 pleasure+0.012
 crop+0.012
Neg logit contributions
 other-0.020
 things-0.019
 cars-0.018
L-0.018
 she-0.018
Token glass
Token ID5405
Feature activation-3.74e-03

Pos logit contributions
ccoli+0.149
lette+0.142
 extingu+0.138
urger+0.137
work+0.128
Neg logit contributions
 other-0.104
 me-0.089
 things-0.088
L-0.088
 Lily-0.087
Token.
Token ID13
Feature activation+1.76e-03

Pos logit contributions
 extingu+0.011
ccoli+0.010
lette+0.010
 pleasure+0.010
work+0.010
Neg logit contributions
 other-0.004
 things-0.003
L-0.003
 cars-0.003
 people-0.003
Token\n
Token ID198
Feature activation+5.51e-03

Pos logit contributions
 other+0.002
 things+0.002
L+0.001
 cars+0.001
Stop+0.001
Neg logit contributions
 extingu-0.005
ccoli-0.005
lette-0.005
work-0.005
 pleasure-0.005
Token\n
Token ID198
Feature activation+6.63e-03

Pos logit contributions
 other+0.008
 things+0.007
 cars+0.007
 balls+0.006
Stop+0.006
Neg logit contributions
ccoli-0.012
 extingu-0.012
lette-0.012
 pleasure-0.012
ossom-0.012
TokenAs
Token ID1722
Feature activation-5.12e-03

Pos logit contributions
 other+0.007
 things+0.006
 cars+0.006
 people+0.005
 balls+0.005
Neg logit contributions
lette-0.018
ccoli-0.017
 extingu-0.017
ossom-0.017
ugg-0.016
Token She
Token ID1375
Feature activation+6.31e-03

Pos logit contributions
 other+0.002
 things+0.001
L+0.001
Stop+0.001
 cars+0.001
Neg logit contributions
 extingu-0.003
ccoli-0.003
lette-0.003
work-0.003
 pleasure-0.003
Token takes
Token ID2753
Feature activation-2.34e-03

Pos logit contributions
 other+0.009
 things+0.008
 cars+0.007
L+0.007
 people+0.007
Neg logit contributions
lette-0.016
 extingu-0.016
work-0.015
ccoli-0.015
 pleasure-0.015
Token out
Token ID503
Feature activation-5.06e-03

Pos logit contributions
 extingu+0.006
ccoli+0.006
lette+0.006
work+0.005
 pleasure+0.005
Neg logit contributions
 other-0.003
 things-0.003
L-0.003
 cars-0.003
Tom-0.003
Token a
Token ID257
Feature activation+3.47e-03

Pos logit contributions
 extingu+0.013
ccoli+0.013
lette+0.013
work+0.013
 pleasure+0.012
Neg logit contributions
 other-0.006
L-0.005
 things-0.005
Tom-0.005
 cars-0.005
Token magn
Token ID7842
Feature activation-7.52e-03

Pos logit contributions
 other+0.006
 things+0.005
L+0.005
 cars+0.004
 people+0.004
Neg logit contributions
 extingu-0.008
ccoli-0.008
lette-0.008
work-0.008
 pleasure-0.007
Tokenifying
Token ID4035
Feature activation-6.23e-02

Pos logit contributions
ccoli+0.011
 extingu+0.010
lette+0.010
 pleasure+0.010
 crop+0.009
Neg logit contributions
 other-0.018
L-0.016
 things-0.016
 cars-0.016
Stop-0.016
Token glass
Token ID5405
Feature activation-3.69e-03

Pos logit contributions
ccoli+0.142
lette+0.140
urger+0.134
 extingu+0.133
 pleasure+0.126
Neg logit contributions
 other-0.105
L-0.091
 me-0.091
 Lily-0.090
 things-0.089
Token.
Token ID13
Feature activation+4.52e-03

Pos logit contributions
ccoli+0.010
 extingu+0.010
lette+0.010
 pleasure+0.010
work+0.009
Neg logit contributions
 other-0.004
 things-0.004
L-0.004
 cars-0.003
 people-0.003
Token She
Token ID1375
Feature activation+7.12e-03

Pos logit contributions
 other+0.005
 things+0.004
L+0.004
 cars+0.004
 people+0.004
Neg logit contributions
 extingu-0.012
ccoli-0.012
lette-0.011
work-0.011
 pleasure-0.011
Token looks
Token ID3073
Feature activation-2.07e-03

Pos logit contributions
 other+0.010
 things+0.009
 cars+0.008
 me+0.008
L+0.008
Neg logit contributions
lette-0.018
 extingu-0.018
work-0.017
ccoli-0.017
 pleasure-0.017
Token at
Token ID379
Feature activation+5.15e-03

Pos logit contributions
 extingu+0.006
lette+0.006
ccoli+0.006
work+0.006
ugg+0.006
Neg logit contributions
 other-0.002
 things-0.002
 she-0.002
 Lily-0.002
 people-0.002
Token,
Token ID11
Feature activation+7.95e-04

Pos logit contributions
 extingu+0.009
ccoli+0.008
lette+0.008
work+0.008
 pleasure+0.008
Neg logit contributions
 other-0.003
 things-0.002
L-0.002
 she-0.002
 people-0.002
Token handing
Token ID22786
Feature activation-1.66e-03

Pos logit contributions
 other+0.001
L+0.000
 she+0.000
Tom+0.000
 Lily+0.000
Neg logit contributions
 extingu-0.002
lette-0.002
ccoli-0.002
work-0.002
 pleasure-0.002
Token him
Token ID683
Feature activation-4.97e-03

Pos logit contributions
 extingu+0.005
ccoli+0.004
work+0.004
lette+0.004
 pleasure+0.004
Neg logit contributions
 other-0.002
L-0.001
 things-0.001
Tom-0.001
 cars-0.001
Token the
Token ID262
Feature activation+2.07e-03

Pos logit contributions
 extingu+0.015
ccoli+0.015
lette+0.015
work+0.015
 pleasure+0.014
Neg logit contributions
 other-0.004
 things-0.003
L-0.003
 cars-0.003
 people-0.003
Token magn
Token ID7842
Feature activation-8.88e-03

Pos logit contributions
 other+0.004
 things+0.003
 cars+0.003
 people+0.003
 kids+0.003
Neg logit contributions
 extingu-0.005
ccoli-0.005
lette-0.005
icol-0.005
ossom-0.005
Tokenifying
Token ID4035
Feature activation-6.01e-02

Pos logit contributions
 extingu+0.012
ccoli+0.012
lette+0.011
 pleasure+0.011
 crop+0.011
Neg logit contributions
 other-0.021
L-0.019
 cars-0.019
 things-0.019
Tom-0.018
Token glass
Token ID5405
Feature activation-5.57e-04

Pos logit contributions
ccoli+0.127
lette+0.127
 extingu+0.123
urger+0.119
work+0.114
Neg logit contributions
 other-0.110
L-0.095
 things-0.094
 Lily-0.093
 me-0.091
Token.
Token ID13
Feature activation+3.71e-03

Pos logit contributions
ccoli+0.002
 extingu+0.002
lette+0.002
 pleasure+0.002
work+0.001
Neg logit contributions
 other-0.001
 things-0.000
L-0.000
 cars-0.000
 people-0.000
Token She
Token ID1375
Feature activation+2.90e-03

Pos logit contributions
 other+0.005
 things+0.005
 people+0.004
Stop+0.004
 cars+0.004
Neg logit contributions
ccoli-0.009
 extingu-0.009
 pleasure-0.008
icol-0.008
ugg-0.008
Token looked
Token ID3114
Feature activation-2.62e-03

Pos logit contributions
 other+0.004
L+0.003
Tom+0.003
 cars+0.003
 things+0.003
Neg logit contributions
 extingu-0.008
lette-0.007
work-0.007
ccoli-0.007
 pleasure-0.007
Token around
Token ID1088
Feature activation-1.25e-03

Pos logit contributions
 extingu+0.007
ccoli+0.007
lette+0.007
work+0.007
 pleasure+0.007
Neg logit contributions
 other-0.003
 things-0.003
L-0.002
 cars-0.002
 people-0.002
Token He
Token ID679
Feature activation+2.16e-03

Pos logit contributions
 extingu+0.002
ccoli+0.002
work+0.002
lette+0.002
 pleasure+0.002
Neg logit contributions
 other-0.001
L-0.001
 things-0.001
 cars-0.001
Stop-0.001
Token took
Token ID1718
Feature activation-2.75e-03

Pos logit contributions
 other+0.003
L+0.002
 cars+0.002
 things+0.002
Tom+0.002
Neg logit contributions
 extingu-0.006
ccoli-0.006
lette-0.006
 pleasure-0.006
work-0.005
Token out
Token ID503
Feature activation-6.10e-03

Pos logit contributions
 extingu+0.007
ccoli+0.007
lette+0.007
work+0.007
 pleasure+0.006
Neg logit contributions
 other-0.004
L-0.003
 things-0.003
Tom-0.003
 cars-0.003
Token a
Token ID257
Feature activation+4.66e-03

Pos logit contributions
 extingu+0.017
ccoli+0.017
lette+0.016
work+0.016
 pleasure+0.016
Neg logit contributions
 other-0.007
L-0.005
 things-0.005
Tom-0.005
 cars-0.005
Token magn
Token ID7842
Feature activation-6.95e-03

Pos logit contributions
 other+0.008
 things+0.007
L+0.006
 cars+0.006
 people+0.006
Neg logit contributions
 extingu-0.011
ccoli-0.010
lette-0.010
work-0.010
 pleasure-0.010
Tokenifying
Token ID4035
Feature activation-5.71e-02

Pos logit contributions
ccoli+0.010
 extingu+0.010
lette+0.010
 pleasure+0.009
 crop+0.009
Neg logit contributions
 other-0.016
L-0.014
 cars-0.014
 things-0.014
Stop-0.014
Token glass
Token ID5405
Feature activation-3.25e-04

Pos logit contributions
ccoli+0.132
lette+0.131
 extingu+0.123
urger+0.123
 pleasure+0.115
Neg logit contributions
 other-0.095
L-0.085
 me-0.082
 Lily-0.081
 things-0.080
Token and
Token ID290
Feature activation+4.64e-03

Pos logit contributions
ccoli+0.001
lette+0.001
 extingu+0.001
 pleasure+0.001
work+0.001
Neg logit contributions
 other-0.000
 things-0.000
L-0.000
 cars-0.000
 she-0.000
Token a
Token ID257
Feature activation-3.42e-03

Pos logit contributions
 other+0.007
 things+0.006
 she+0.006
 people+0.006
 cars+0.006
Neg logit contributions
 extingu-0.011
ccoli-0.011
lette-0.011
 pleasure-0.010
work-0.010
Token notebook
Token ID20922
Feature activation-1.05e-03

Pos logit contributions
 extingu+0.008
ccoli+0.007
lette+0.007
work+0.007
ugg+0.007
Neg logit contributions
 other-0.006
 things-0.005
L-0.005
 cars-0.005
 people-0.005
Token and
Token ID290
Feature activation+3.07e-03

Pos logit contributions
 extingu+0.003
ccoli+0.003
lette+0.003
 pleasure+0.003
work+0.003
Neg logit contributions
 other-0.001
 things-0.001
L-0.001
 cars-0.001
 she-0.001
Token
Token ID198
Feature activation+1.75e-04

Pos logit contributions
 extingu+0.006
ccoli+0.006
lette+0.006
 pleasure+0.006
ossom+0.006
Neg logit contributions
 other-0.004
 things-0.004
 cars-0.003
Stop-0.003
 balls-0.003
TokenHe
Token ID1544
Feature activation-6.61e-04

Pos logit contributions
 other+0.000
 things+0.000
 cars+0.000
 kids+0.000
 Lily+0.000
Neg logit contributions
lette-0.000
ccoli-0.000
 extingu-0.000
ossom-0.000
 crop-0.000
Token brought
Token ID3181
Feature activation-7.05e-03

Pos logit contributions
 extingu+0.001
ccoli+0.001
lette+0.001
work+0.001
 crop+0.001
Neg logit contributions
 other-0.001
 things-0.001
L-0.001
Tom-0.001
 cars-0.001
Token his
Token ID465
Feature activation-4.78e-03

Pos logit contributions
ccoli+0.021
 extingu+0.021
lette+0.020
work+0.020
icol+0.020
Neg logit contributions
 other-0.007
 things-0.006
 people-0.005
 cars-0.005
L-0.005
Token magn
Token ID7842
Feature activation-6.88e-03

Pos logit contributions
 extingu+0.012
ccoli+0.012
lette+0.011
icol+0.011
ugg+0.011
Neg logit contributions
 other-0.008
 things-0.007
 cars-0.006
 people-0.006
 kids-0.006
Tokenifying
Token ID4035
Feature activation-5.63e-02

Pos logit contributions
ccoli+0.011
 extingu+0.010
 pleasure+0.010
lette+0.009
 crop+0.009
Neg logit contributions
 other-0.015
L-0.014
 cars-0.014
 things-0.014
 me-0.013
Token glass
Token ID5405
Feature activation-2.96e-03

Pos logit contributions
ccoli+0.126
lette+0.125
 extingu+0.119
urger+0.118
work+0.116
Neg logit contributions
 other-0.098
L-0.084
 things-0.082
 Lily-0.082
 me-0.080
Token and
Token ID290
Feature activation+3.35e-03

Pos logit contributions
ccoli+0.008
lette+0.008
 pleasure+0.008
 extingu+0.008
icol+0.007
Neg logit contributions
 other-0.004
 things-0.003
 cars-0.003
L-0.003
 balls-0.003
Token looked
Token ID3114
Feature activation-5.19e-03

Pos logit contributions
 other+0.005
 things+0.004
 cars+0.004
 people+0.004
L+0.004
Neg logit contributions
 extingu-0.009
ccoli-0.009
lette-0.008
work-0.008
 pleasure-0.008
Token closely
Token ID7173
Feature activation+2.26e-03

Pos logit contributions
 extingu+0.013
lette+0.013
ccoli+0.013
work+0.013
 pleasure+0.012
Neg logit contributions
 other-0.007
 things-0.006
 cars-0.005
L-0.005
 people-0.005
Token.
Token ID13
Feature activation-4.60e-04

Pos logit contributions
 other+0.002
 things+0.002
L+0.002
 cars+0.002
Tom+0.002
Neg logit contributions
 extingu-0.006
ccoli-0.006
lette-0.006
work-0.006
 pleasure-0.006
Token the
Token ID262
Feature activation-1.07e-04

Pos logit contributions
 other+0.004
L+0.004
Stop+0.004
Tom+0.004
Fine+0.004
Neg logit contributions
 extingu-0.011
work-0.010
 pleasure-0.010
ccoli-0.009
ci-0.009
Token watch
Token ID2342
Feature activation-7.49e-04

Pos logit contributions
 extingu+0.000
ccoli+0.000
lette+0.000
work+0.000
 pleasure+0.000
Neg logit contributions
 other-0.000
L-0.000
 things-0.000
Tom-0.000
 cars-0.000
Token with
Token ID351
Feature activation-2.19e-03

Pos logit contributions
 extingu+0.002
ccoli+0.002
lette+0.002
work+0.002
 pleasure+0.002
Neg logit contributions
 other-0.001
L-0.001
 things-0.001
Stop-0.001
Tom-0.001
Token the
Token ID262
Feature activation+1.56e-03

Pos logit contributions
ccoli+0.006
 extingu+0.006
lette+0.006
work+0.006
ossom+0.006
Neg logit contributions
 other-0.002
 things-0.002
 cars-0.002
L-0.002
 people-0.002
Token magn
Token ID7842
Feature activation-7.42e-03

Pos logit contributions
 other+0.003
L+0.002
 things+0.002
 cars+0.002
Tom+0.002
Neg logit contributions
 extingu-0.003
ccoli-0.003
lette-0.003
work-0.003
 pleasure-0.003
Tokenifying
Token ID4035
Feature activation-5.61e-02

Pos logit contributions
 extingu+0.010
ccoli+0.010
 pleasure+0.009
lette+0.009
 crop+0.009
Neg logit contributions
 other-0.018
L-0.016
 cars-0.016
 things-0.016
Tom-0.016
Token glass
Token ID5405
Feature activation-1.20e-03

Pos logit contributions
ccoli+0.121
lette+0.119
 extingu+0.115
urger+0.112
 pleasure+0.107
Neg logit contributions
 other-0.102
L-0.089
 things-0.086
 Lily-0.086
 me-0.084
Token.
Token ID13
Feature activation+2.93e-03

Pos logit contributions
 extingu+0.003
ccoli+0.003
lette+0.003
work+0.003
 pleasure+0.003
Neg logit contributions
 other-0.001
 things-0.001
L-0.001
 cars-0.001
Tom-0.001
Token He
Token ID679
Feature activation+1.41e-03

Pos logit contributions
 other+0.004
L+0.003
 things+0.003
Tom+0.003
 cars+0.003
Neg logit contributions
 extingu-0.008
ccoli-0.008
lette-0.007
work-0.007
 pleasure-0.007
Token saw
Token ID2497
Feature activation-3.93e-03

Pos logit contributions
 other+0.002
 things+0.002
 cars+0.002
Stop+0.002
L+0.002
Neg logit contributions
ccoli-0.003
 extingu-0.003
lette-0.003
work-0.003
 crop-0.003
Token the
Token ID262
Feature activation+8.53e-04

Pos logit contributions
 extingu+0.009
work+0.008
ccoli+0.008
 pleasure+0.008
lette+0.008
Neg logit contributions
L-0.005
Stop-0.005
 other-0.005
Tom-0.005
Fine-0.005
Token the
Token ID262
Feature activation+1.51e-03

Pos logit contributions
 extingu+0.015
ccoli+0.015
lette+0.014
work+0.014
 pleasure+0.014
Neg logit contributions
 other-0.005
L-0.004
Stop-0.004
Tom-0.004
Tim-0.004
Token necklace
Token ID38987
Feature activation+5.01e-03

Pos logit contributions
 other+0.003
L+0.002
 things+0.002
 cars+0.002
Tom+0.002
Neg logit contributions
 extingu-0.003
ccoli-0.003
lette-0.003
work-0.003
 pleasure-0.003
Token and
Token ID290
Feature activation+1.10e-04

Pos logit contributions
 other+0.005
 things+0.004
L+0.004
 she+0.004
 cars+0.004
Neg logit contributions
ccoli-0.014
 extingu-0.014
lette-0.014
 pleasure-0.014
work-0.014
Token the
Token ID262
Feature activation-7.51e-03

Pos logit contributions
 other+0.000
 things+0.000
L+0.000
 cars+0.000
 she+0.000
Neg logit contributions
 extingu-0.000
lette-0.000
ccoli-0.000
work-0.000
 pleasure-0.000
Token magn
Token ID7842
Feature activation-5.86e-03

Pos logit contributions
 extingu+0.016
ccoli+0.016
lette+0.016
work+0.015
 pleasure+0.015
Neg logit contributions
 other-0.013
L-0.011
 things-0.011
 cars-0.011
Tom-0.011
Tokenifying
Token ID4035
Feature activation-5.53e-02

Pos logit contributions
 extingu+0.008
ccoli+0.007
lette+0.007
work+0.007
 pleasure+0.007
Neg logit contributions
 other-0.015
L-0.014
 things-0.014
 cars-0.013
Tom-0.013
Token glass
Token ID5405
Feature activation+1.16e-03

Pos logit contributions
ccoli+0.117
lette+0.113
 extingu+0.112
urger+0.106
work+0.104
Neg logit contributions
 other-0.103
L-0.091
 things-0.088
 Lily-0.086
 me-0.085
Token and
Token ID290
Feature activation+2.44e-04

Pos logit contributions
 other+0.001
 things+0.001
L+0.001
 cars+0.001
 people+0.001
Neg logit contributions
 extingu-0.003
ccoli-0.003
lette-0.003
 pleasure-0.003
work-0.003
Token started
Token ID2067
Feature activation+1.08e-03

Pos logit contributions
 other+0.000
 things+0.000
L+0.000
 she+0.000
 people+0.000
Neg logit contributions
 extingu-0.001
ccoli-0.001
lette-0.001
work-0.001
 pleasure-0.001
Token to
Token ID284
Feature activation+3.06e-03

Pos logit contributions
 other+0.001
 things+0.001
 cars+0.001
 she+0.001
 people+0.001
Neg logit contributions
ccoli-0.003
 extingu-0.003
lette-0.003
work-0.003
 pleasure-0.003
Token examine
Token ID10716
Feature activation-1.82e-03

Pos logit contributions
 other+0.004
 me+0.004
 she+0.004
 Lily+0.004
L+0.003
Neg logit contributions
lette-0.008
ccoli-0.008
work-0.008
 extingu-0.007
urger-0.007
Token as
Token ID355
Feature activation+7.38e-04

Pos logit contributions
 other+0.001
 things+0.001
 cars+0.001
 people+0.001
L+0.001
Neg logit contributions
 extingu-0.003
ccoli-0.002
lette-0.002
work-0.002
 pleasure-0.002
Token he
Token ID339
Feature activation+3.91e-03

Pos logit contributions
 other+0.001
 things+0.000
L+0.000
 people+0.000
 cars+0.000
Neg logit contributions
 extingu-0.002
ccoli-0.002
lette-0.002
work-0.002
 pleasure-0.002
Token opened
Token ID4721
Feature activation+1.81e-03

Pos logit contributions
 other+0.006
L+0.005
 things+0.004
 cars+0.004
 people+0.004
Neg logit contributions
 extingu-0.010
ccoli-0.009
lette-0.009
work-0.009
 pleasure-0.009
Token the
Token ID262
Feature activation+3.22e-03

Pos logit contributions
 other+0.001
 things+0.001
 cars+0.001
L+0.001
 people+0.001
Neg logit contributions
ccoli-0.006
 extingu-0.006
lette-0.005
work-0.005
 pleasure-0.005
Token ch
Token ID442
Feature activation-1.49e-02

Pos logit contributions
 other+0.004
 things+0.003
 cars+0.003
 people+0.003
 me+0.003
Neg logit contributions
ccoli-0.009
 extingu-0.009
lette-0.009
ossom-0.008
work-0.008
Tokenocol
Token ID4668
Feature activation-5.35e-02

Pos logit contributions
ccoli+0.030
 extingu+0.030
lette+0.028
work+0.027
 pleasure+0.026
Neg logit contributions
 other-0.028
 things-0.026
 people-0.025
 cars-0.025
L-0.024
Tokenates
Token ID689
Feature activation-5.51e-03

Pos logit contributions
 pleasure+0.079
 extingu+0.077
ccoli+0.076
icol+0.071
work+0.068
Neg logit contributions
 other-0.120
 things-0.111
 cars-0.107
Tom-0.106
L-0.106
Token and
Token ID290
Feature activation+9.19e-03

Pos logit contributions
 extingu+0.016
ccoli+0.016
lette+0.015
 pleasure+0.015
icol+0.015
Neg logit contributions
 other-0.006
 things-0.004
 cars-0.004
 she-0.004
 people-0.004
Token started
Token ID2067
Feature activation-1.38e-03

Pos logit contributions
 other+0.014
 things+0.012
 people+0.012
 cars+0.012
 she+0.012
Neg logit contributions
 extingu-0.022
ccoli-0.021
lette-0.021
 pleasure-0.020
work-0.020
Token eating
Token ID6600
Feature activation-6.21e-03

Pos logit contributions
ccoli+0.004
 extingu+0.004
lette+0.003
work+0.003
 pleasure+0.003
Neg logit contributions
 other-0.002
 things-0.002
 cars-0.002
 she-0.002
 people-0.002
Token them
Token ID606
Feature activation-3.11e-03

Pos logit contributions
 extingu+0.018
ccoli+0.018
lette+0.017
work+0.017
 pleasure+0.017
Neg logit contributions
 other-0.006
 things-0.005
 cars-0.005
L-0.005
 people-0.005
Token it
Token ID340
Feature activation+3.23e-06

Pos logit contributions
 extingu+0.014
ccoli+0.014
work+0.014
 pleasure+0.014
lette+0.014
Neg logit contributions
 other-0.004
L-0.003
Stop-0.003
Tom-0.003
 things-0.003
Token she
Token ID673
Feature activation+1.58e-02

Feature not active on previous token; no predictions were affected.
Token had
Token ID550
Feature activation-5.20e-03

Pos logit contributions
 other+0.020
L+0.016
 things+0.016
 cars+0.015
 people+0.015
Neg logit contributions
 extingu-0.044
ccoli-0.043
lette-0.042
work-0.041
 pleasure-0.040
Token a
Token ID257
Feature activation+7.82e-04

Pos logit contributions
 extingu+0.012
ccoli+0.012
lette+0.011
work+0.011
 pleasure+0.011
Neg logit contributions
 other-0.008
L-0.007
 things-0.007
 cars-0.006
Tom-0.006
Token magn
Token ID7842
Feature activation-5.39e-03

Pos logit contributions
 other+0.001
 things+0.001
L+0.001
 cars+0.001
 people+0.001
Neg logit contributions
 extingu-0.002
ccoli-0.002
lette-0.002
work-0.002
 pleasure-0.002
Tokenifying
Token ID4035
Feature activation-5.18e-02

Pos logit contributions
ccoli+0.008
 extingu+0.008
lette+0.007
 hyd+0.007
 pleasure+0.007
Neg logit contributions
 other-0.012
L-0.011
 cars-0.011
 she-0.011
 things-0.011
Token glass
Token ID5405
Feature activation+5.92e-04

Pos logit contributions
ccoli+0.122
lette+0.118
 extingu+0.115
urger+0.113
work+0.107
Neg logit contributions
 other-0.087
L-0.075
 me-0.074
 Lily-0.073
 things-0.069
Token,
Token ID11
Feature activation+1.58e-03

Pos logit contributions
 other+0.001
 things+0.001
L+0.000
 cars+0.000
 she+0.000
Neg logit contributions
ccoli-0.002
 extingu-0.002
lette-0.002
 pleasure-0.002
work-0.002
Token a
Token ID257
Feature activation+8.33e-04

Pos logit contributions
 other+0.002
 things+0.002
L+0.002
 cars+0.001
 people+0.001
Neg logit contributions
 extingu-0.004
ccoli-0.004
lette-0.004
work-0.004
 pleasure-0.004
Token tiny
Token ID7009
Feature activation+3.86e-03

Pos logit contributions
 other+0.001
 things+0.001
L+0.001
 cars+0.001
 people+0.001
Neg logit contributions
 extingu-0.002
ccoli-0.002
lette-0.002
work-0.002
 pleasure-0.002
Token torch
Token ID28034
Feature activation-4.76e-03

Pos logit contributions
 other+0.006
 things+0.005
L+0.005
 cars+0.005
 people+0.005
Neg logit contributions
ccoli-0.009
 extingu-0.009
lette-0.009
work-0.008
 pleasure-0.008
Token said
Token ID531
Feature activation-9.41e-04

Pos logit contributions
 other+0.002
 things+0.002
L+0.002
 cars+0.001
Tom+0.001
Neg logit contributions
 extingu-0.004
ccoli-0.004
lette-0.004
work-0.004
 pleasure-0.004
Token,
Token ID11
Feature activation+2.23e-03

Pos logit contributions
ccoli+0.003
 extingu+0.003
lette+0.003
 pleasure+0.003
work+0.003
Neg logit contributions
 other-0.001
 she-0.001
 things-0.000
 Lily-0.000
L-0.000
Token holding
Token ID4769
Feature activation-1.74e-03

Pos logit contributions
 other+0.003
 she+0.003
L+0.003
 things+0.003
 people+0.003
Neg logit contributions
ccoli-0.005
 extingu-0.005
 pleasure-0.005
lette-0.005
work-0.005
Token the
Token ID262
Feature activation+2.07e-03

Pos logit contributions
 extingu+0.005
ccoli+0.004
lette+0.004
work+0.004
 pleasure+0.004
Neg logit contributions
 other-0.002
 things-0.002
L-0.002
 cars-0.002
 people-0.002
Token magn
Token ID7842
Feature activation-4.93e-03

Pos logit contributions
 other+0.004
 things+0.003
L+0.003
 cars+0.003
 people+0.003
Neg logit contributions
 extingu-0.004
ccoli-0.004
lette-0.004
work-0.004
 pleasure-0.004
Tokenifying
Token ID4035
Feature activation-5.17e-02

Pos logit contributions
ccoli+0.007
 extingu+0.007
 pleasure+0.006
lette+0.006
 crop+0.006
Neg logit contributions
 other-0.012
L-0.011
 things-0.011
 cars-0.011
Tom-0.010
Token glass
Token ID5405
Feature activation+2.88e-03

Pos logit contributions
ccoli+0.109
lette+0.107
 extingu+0.104
urger+0.101
work+0.099
Neg logit contributions
 other-0.096
L-0.084
 things-0.082
 Lily-0.080
 me-0.079
Token over
Token ID625
Feature activation-4.05e-04

Pos logit contributions
 other+0.003
 things+0.002
L+0.002
 cars+0.002
 people+0.002
Neg logit contributions
ccoli-0.008
 extingu-0.008
lette-0.008
 pleasure-0.008
work-0.008
Token a
Token ID257
Feature activation+2.90e-03

Pos logit contributions
 extingu+0.001
lette+0.001
work+0.001
 pleasure+0.001
ccoli+0.001
Neg logit contributions
 other-0.000
 cars-0.000
Stop-0.000
 things-0.000
L-0.000
Token black
Token ID2042
Feature activation+5.58e-03

Pos logit contributions
 other+0.005
L+0.004
 things+0.004
 cars+0.004
Tom+0.004
Neg logit contributions
 extingu-0.006
ccoli-0.006
lette-0.006
work-0.006
 pleasure-0.006
Token ant
Token ID1885
Feature activation+7.03e-03

Pos logit contributions
 other+0.009
L+0.008
 things+0.008
 cars+0.007
Tom+0.007
Neg logit contributions
 extingu-0.013
ccoli-0.012
lette-0.012
work-0.012
 pleasure-0.012
Token
Token ID198
Feature activation+1.91e-03

Pos logit contributions
 other+0.001
 things+0.001
 cars+0.001
 balls+0.001
Stop+0.001
Neg logit contributions
 extingu-0.002
ccoli-0.002
lette-0.002
work-0.002
 pleasure-0.002
TokenHe
Token ID1544
Feature activation+2.68e-03

Pos logit contributions
 other+0.002
 things+0.002
 cars+0.002
 balls+0.002
 kids+0.002
Neg logit contributions
 extingu-0.005
lette-0.004
ccoli-0.004
work-0.004
ossom-0.004
Token pointed
Token ID6235
Feature activation+6.90e-04

Pos logit contributions
 other+0.004
L+0.004
Tom+0.003
 things+0.003
 cars+0.003
Neg logit contributions
 extingu-0.006
ccoli-0.006
lette-0.006
 pleasure-0.006
work-0.006
Token his
Token ID465
Feature activation-4.98e-03

Pos logit contributions
 other+0.001
 things+0.001
L+0.001
 cars+0.001
 people+0.001
Neg logit contributions
 extingu-0.002
ccoli-0.002
lette-0.002
work-0.002
 pleasure-0.002
Token magn
Token ID7842
Feature activation-6.38e-03

Pos logit contributions
 extingu+0.011
ccoli+0.010
lette+0.010
work+0.010
 pleasure+0.010
Neg logit contributions
 other-0.009
 things-0.008
L-0.008
 cars-0.007
 people-0.007
Tokenifying
Token ID4035
Feature activation-5.13e-02

Pos logit contributions
ccoli+0.010
 extingu+0.009
 pleasure+0.009
lette+0.009
 crop+0.009
Neg logit contributions
 other-0.014
 things-0.013
L-0.013
 cars-0.013
 me-0.012
Token glass
Token ID5405
Feature activation-6.04e-04

Pos logit contributions
ccoli+0.113
lette+0.112
 extingu+0.107
urger+0.104
work+0.103
Neg logit contributions
 other-0.091
 things-0.079
L-0.078
 Lily-0.076
 me-0.074
Token at
Token ID379
Feature activation+5.05e-03

Pos logit contributions
ccoli+0.002
 extingu+0.002
lette+0.002
 pleasure+0.002
work+0.002
Neg logit contributions
 other-0.001
 things-0.001
 cars-0.001
L-0.001
 people-0.001
Token different
Token ID1180
Feature activation+6.05e-03

Pos logit contributions
 other+0.005
L+0.005
Stop+0.004
Tom+0.004
Tim+0.004
Neg logit contributions
 extingu-0.014
work-0.014
ccoli-0.013
lette-0.013
 pleasure-0.013
Token classmates
Token ID28999
Feature activation+8.68e-03

Pos logit contributions
L+0.007
 other+0.007
Tom+0.007
Tim+0.007
Stop+0.007
Neg logit contributions
 extingu-0.015
lette-0.014
work-0.014
ccoli-0.014
 pleasure-0.013
Token and
Token ID290
Feature activation+3.22e-04

Pos logit contributions
 other+0.010
L+0.008
 things+0.008
Tom+0.007
 cars+0.007
Neg logit contributions
 extingu-0.024
ccoli-0.023
lette-0.023
work-0.022
 pleasure-0.022
Token,
Token ID11
Feature activation+2.11e-03

Pos logit contributions
 other+0.002
L+0.001
 things+0.001
 cars+0.001
Tom+0.001
Neg logit contributions
 extingu-0.004
ccoli-0.004
lette-0.004
work-0.004
 pleasure-0.004
Token next
Token ID1306
Feature activation-4.68e-03

Pos logit contributions
 other+0.003
 she+0.002
 things+0.002
L+0.002
 cars+0.002
Neg logit contributions
 extingu-0.006
ccoli-0.006
lette-0.006
work-0.006
 pleasure-0.005
Token to
Token ID284
Feature activation+5.33e-03

Pos logit contributions
 extingu+0.011
ccoli+0.011
lette+0.011
work+0.011
 pleasure+0.010
Neg logit contributions
 other-0.007
 things-0.006
L-0.006
 cars-0.006
Tom-0.006
Token her
Token ID607
Feature activation-1.40e-03

Pos logit contributions
 other+0.004
L+0.003
 things+0.003
Tom+0.003
Tim+0.003
Neg logit contributions
 extingu-0.016
ccoli-0.016
lette-0.015
work-0.015
 pleasure-0.015
Token magn
Token ID7842
Feature activation-4.84e-03

Pos logit contributions
 extingu+0.004
ccoli+0.004
lette+0.004
work+0.003
acket+0.003
Neg logit contributions
 other-0.002
 things-0.002
 cars-0.002
 people-0.001
L-0.001
Tokenifying
Token ID4035
Feature activation-5.09e-02

Pos logit contributions
 extingu+0.006
ccoli+0.006
lette+0.006
 pleasure+0.006
 crop+0.005
Neg logit contributions
 other-0.012
L-0.011
 things-0.011
 cars-0.011
Tom-0.011
Token glass
Token ID5405
Feature activation-1.91e-03

Pos logit contributions
ccoli+0.111
lette+0.110
 extingu+0.106
urger+0.103
work+0.100
Neg logit contributions
 other-0.091
L-0.081
 Lily-0.079
 things-0.078
 me-0.076
Token.
Token ID13
Feature activation+1.19e-03

Pos logit contributions
 extingu+0.005
ccoli+0.005
lette+0.004
 pleasure+0.004
work+0.004
Neg logit contributions
 other-0.003
 things-0.002
L-0.002
 cars-0.002
 people-0.002
Token She
Token ID1375
Feature activation+5.07e-03

Pos logit contributions
 other+0.001
 things+0.001
L+0.001
 cars+0.001
Tom+0.001
Neg logit contributions
 extingu-0.003
ccoli-0.003
lette-0.003
work-0.003
 pleasure-0.003
Token looked
Token ID3114
Feature activation-3.78e-03

Pos logit contributions
 other+0.007
L+0.006
 cars+0.006
 she+0.006
 me+0.006
Neg logit contributions
 extingu-0.013
lette-0.013
work-0.013
 pleasure-0.012
ccoli-0.012
Token at
Token ID379
Feature activation+4.09e-03

Pos logit contributions
 extingu+0.010
ccoli+0.010
lette+0.010
work+0.010
 pleasure+0.009
Neg logit contributions
 other-0.005
 things-0.004
L-0.004
 cars-0.004
 people-0.004
Token She
Token ID1375
Feature activation+3.73e-03

Pos logit contributions
 extingu+0.003
ccoli+0.002
lette+0.002
work+0.002
 pleasure+0.002
Neg logit contributions
 other-0.001
L-0.001
 things-0.001
 cars-0.000
 people-0.000
Token took
Token ID1718
Feature activation-4.55e-03

Pos logit contributions
 other+0.006
L+0.005
 cars+0.005
 me+0.004
 things+0.004
Neg logit contributions
 extingu-0.009
lette-0.009
ccoli-0.009
 pleasure-0.009
work-0.009
Token out
Token ID503
Feature activation-1.04e-02

Pos logit contributions
 extingu+0.012
ccoli+0.011
lette+0.011
work+0.011
 pleasure+0.011
Neg logit contributions
 other-0.006
L-0.005
 things-0.005
 cars-0.005
Tom-0.005
Token a
Token ID257
Feature activation+5.14e-03

Pos logit contributions
 extingu+0.029
ccoli+0.029
lette+0.029
work+0.028
 pleasure+0.028
Neg logit contributions
 other-0.011
 things-0.009
L-0.009
 cars-0.008
 people-0.008
Token magn
Token ID7842
Feature activation-4.12e-03

Pos logit contributions
 other+0.007
 things+0.006
L+0.006
 cars+0.005
 people+0.005
Neg logit contributions
lette-0.013
 extingu-0.013
ccoli-0.013
work-0.013
ugg-0.012
Tokenifying
Token ID4035
Feature activation-5.09e-02

Pos logit contributions
ccoli+0.006
 extingu+0.006
 pleasure+0.006
lette+0.006
 crop+0.006
Neg logit contributions
 other-0.009
L-0.008
 cars-0.008
 things-0.008
 me-0.008
Token glass
Token ID5405
Feature activation-1.07e-03

Pos logit contributions
ccoli+0.119
lette+0.118
urger+0.112
 extingu+0.110
 pleasure+0.104
Neg logit contributions
 other-0.086
L-0.075
 Lily-0.072
 me-0.072
 things-0.071
Token and
Token ID290
Feature activation+5.05e-03

Pos logit contributions
ccoli+0.003
lette+0.003
 extingu+0.003
 pleasure+0.003
urger+0.003
Neg logit contributions
 other-0.001
 things-0.001
L-0.001
 she-0.001
 cars-0.001
Token looked
Token ID3114
Feature activation-5.02e-03

Pos logit contributions
 other+0.008
 things+0.007
 she+0.007
 people+0.006
L+0.006
Neg logit contributions
 extingu-0.012
ccoli-0.012
lette-0.012
work-0.012
urger-0.011
Token very
Token ID845
Feature activation+4.13e-03

Pos logit contributions
 extingu+0.013
ccoli+0.013
lette+0.013
work+0.013
acket+0.012
Neg logit contributions
 other-0.007
 things-0.006
 people-0.005
 she-0.005
 it-0.005
Token carefully
Token ID7773
Feature activation+2.84e-03

Pos logit contributions
 other+0.006
 things+0.005
 people+0.005
L+0.005
 cars+0.005
Neg logit contributions
 extingu-0.010
ccoli-0.010
lette-0.010
work-0.010
urger-0.009
Token a
Token ID257
Feature activation-1.99e-04

Pos logit contributions
 extingu+0.015
ccoli+0.015
lette+0.014
work+0.014
 pleasure+0.014
Neg logit contributions
 other-0.008
L-0.007
 things-0.007
Tom-0.007
 cars-0.007
Token flashlight
Token ID38371
Feature activation-5.76e-03

Pos logit contributions
 extingu+0.001
lette+0.001
work+0.001
ccoli+0.001
 pleasure+0.000
Neg logit contributions
 other-0.000
 things-0.000
L-0.000
 cars-0.000
Tom-0.000
Token and
Token ID290
Feature activation-2.05e-03

Pos logit contributions
ccoli+0.017
lette+0.017
 pleasure+0.017
 extingu+0.017
ossom+0.016
Neg logit contributions
 other-0.005
 cars-0.004
 she-0.004
 things-0.004
 people-0.003
Token a
Token ID257
Feature activation-5.24e-03

Pos logit contributions
 extingu+0.005
ccoli+0.005
lette+0.005
work+0.005
 pleasure+0.005
Neg logit contributions
 other-0.003
 things-0.002
 cars-0.002
L-0.002
 people-0.002
Token magn
Token ID7842
Feature activation-5.83e-03

Pos logit contributions
 extingu+0.011
ccoli+0.011
lette+0.011
work+0.011
 pleasure+0.010
Neg logit contributions
 other-0.009
 things-0.008
L-0.008
 cars-0.008
 people-0.007
Tokenifying
Token ID4035
Feature activation-4.96e-02

Pos logit contributions
ccoli+0.008
 extingu+0.008
lette+0.008
 pleasure+0.008
 crop+0.007
Neg logit contributions
 other-0.014
L-0.012
 cars-0.012
 things-0.012
Tom-0.012
Token glass
Token ID5405
Feature activation+7.09e-04

Pos logit contributions
ccoli+0.109
lette+0.107
 extingu+0.106
urger+0.100
work+0.099
Neg logit contributions
 other-0.088
L-0.077
 things-0.075
 me-0.074
 cars-0.073
Token to
Token ID284
Feature activation+1.10e-02

Pos logit contributions
 other+0.001
 things+0.001
 cars+0.001
L+0.001
 people+0.001
Neg logit contributions
ccoli-0.002
 extingu-0.002
lette-0.002
 pleasure-0.002
work-0.002
Token examine
Token ID10716
Feature activation+1.70e-03

Pos logit contributions
 other+0.020
L+0.017
 things+0.017
 cars+0.017
 people+0.016
Neg logit contributions
 extingu-0.023
ccoli-0.023
lette-0.022
work-0.022
 pleasure-0.021
Token their
Token ID511
Feature activation-1.71e-04

Pos logit contributions
L+0.002
 other+0.002
Tom+0.002
Stop+0.001
Tim+0.001
Neg logit contributions
 extingu-0.005
ccoli-0.004
work-0.004
lette-0.004
 crop-0.004
Token finds
Token ID7228
Feature activation-2.38e-03

Pos logit contributions
 extingu+0.000
ccoli+0.000
work+0.000
lette+0.000
 pleasure+0.000
Neg logit contributions
 other-0.000
L-0.000
Tom-0.000
Tim-0.000
Stop-0.000
Token borrowed
Token ID22546
Feature activation+3.67e-04

Pos logit contributions
 other+0.004
 things+0.004
L+0.003
 cars+0.003
Tom+0.003
Neg logit contributions
 extingu-0.008
ccoli-0.008
lette-0.007
work-0.007
 pleasure-0.007
Token her
Token ID607
Feature activation-1.52e-03

Pos logit contributions
 other+0.000
L+0.000
Tom+0.000
Stop+0.000
Tim+0.000
Neg logit contributions
 extingu-0.001
ccoli-0.001
lette-0.001
work-0.001
 pleasure-0.001
Token mom
Token ID1995
Feature activation+4.85e-03

Pos logit contributions
 extingu+0.004
ccoli+0.003
lette+0.003
work+0.003
acket+0.003
Neg logit contributions
 other-0.003
 things-0.002
 cars-0.002
 people-0.002
L-0.002
Token's
Token ID338
Feature activation-1.30e-03

Pos logit contributions
 other+0.006
 things+0.004
L+0.004
 cars+0.004
 people+0.004
Neg logit contributions
 extingu-0.013
ccoli-0.013
 pleasure-0.013
icol-0.013
lette-0.013
Token magn
Token ID7842
Feature activation-3.59e-03

Pos logit contributions
ccoli+0.003
 extingu+0.003
work+0.003
lette+0.003
acket+0.003
Neg logit contributions
 other-0.002
 things-0.002
 people-0.002
 cars-0.002
 me-0.002
Tokenifying
Token ID4035
Feature activation-4.94e-02

Pos logit contributions
 extingu+0.005
ccoli+0.005
lette+0.005
 pleasure+0.005
 crop+0.005
Neg logit contributions
 other-0.008
L-0.008
 things-0.007
 me-0.007
 cars-0.007
Token glass
Token ID5405
Feature activation+3.93e-04

Pos logit contributions
lette+0.109
ccoli+0.109
 extingu+0.104
urger+0.101
work+0.101
Neg logit contributions
 other-0.087
 things-0.075
L-0.074
 me-0.074
 Lily-0.074
Token to
Token ID284
Feature activation+7.01e-03

Pos logit contributions
 other+0.000
 things+0.000
L+0.000
 cars+0.000
 people+0.000
Neg logit contributions
 extingu-0.001
ccoli-0.001
lette-0.001
 pleasure-0.001
work-0.001
Token take
Token ID1011
Feature activation+3.05e-03

Pos logit contributions
 other+0.012
L+0.010
 things+0.010
 cars+0.009
 people+0.009
Neg logit contributions
 extingu-0.016
ccoli-0.015
lette-0.015
work-0.015
 pleasure-0.014
Token a
Token ID257
Feature activation+6.41e-03

Pos logit contributions
 other+0.003
 things+0.002
L+0.002
 people+0.002
 cars+0.002
Neg logit contributions
 extingu-0.009
ccoli-0.009
work-0.009
lette-0.009
 pleasure-0.008
Token closer
Token ID5699
Feature activation+5.92e-03

Pos logit contributions
 other+0.011
L+0.009
 things+0.009
 cars+0.009
Tim+0.008
Neg logit contributions
 extingu-0.014
ccoli-0.014
lette-0.014
work-0.014
 pleasure-0.013
Token and
Token ID290
Feature activation+5.09e-03

Pos logit contributions
 other+0.015
Stop+0.014
L+0.014
 cars+0.013
 Lia+0.013
Neg logit contributions
lette-0.021
 extingu-0.021
ccoli-0.021
work-0.020
urger-0.020
Token handed
Token ID10158
Feature activation+3.53e-03

Pos logit contributions
 other+0.008
L+0.007
 she+0.007
 Lily+0.007
 things+0.006
Neg logit contributions
 extingu-0.012
ccoli-0.012
lette-0.012
work-0.011
 pleasure-0.011
Token him
Token ID683
Feature activation-5.04e-03

Pos logit contributions
Tim+0.003
Fine+0.003
L+0.003
Stop+0.003
Too+0.003
Neg logit contributions
ccoli-0.009
 extingu-0.008
work-0.008
lette-0.008
 pleasure-0.008
Token the
Token ID262
Feature activation+2.27e-03

Pos logit contributions
 extingu+0.015
ccoli+0.015
lette+0.015
icol+0.015
 pleasure+0.015
Neg logit contributions
 other-0.005
 things-0.004
 cars-0.003
L-0.003
 people-0.003
Token magn
Token ID7842
Feature activation-4.99e-03

Pos logit contributions
 other+0.004
L+0.004
Tom+0.004
Tim+0.004
Stop+0.004
Neg logit contributions
 extingu-0.004
ccoli-0.004
lette-0.004
work-0.004
 pleasure-0.004
Tokenifying
Token ID4035
Feature activation-4.74e-02

Pos logit contributions
 extingu+0.007
ccoli+0.007
lette+0.006
 pleasure+0.006
 crop+0.006
Neg logit contributions
 other-0.012
L-0.011
 things-0.011
 cars-0.011
Tom-0.010
Token glass
Token ID5405
Feature activation+3.12e-03

Pos logit contributions
ccoli+0.095
lette+0.093
 extingu+0.093
urger+0.086
work+0.086
Neg logit contributions
 other-0.092
L-0.082
 things-0.080
 cars-0.076
 Lily-0.076
Token.
Token ID13
Feature activation+2.44e-03

Pos logit contributions
 other+0.003
 things+0.003
L+0.002
 cars+0.002
 people+0.002
Neg logit contributions
 extingu-0.009
ccoli-0.009
lette-0.009
work-0.008
 pleasure-0.008
Token
Token ID198
Feature activation+3.19e-04

Pos logit contributions
 other+0.003
 things+0.003
 cars+0.003
 people+0.003
Stop+0.003
Neg logit contributions
ccoli-0.006
 extingu-0.006
 pleasure-0.006
ugg-0.005
lette-0.005
Token
Token ID198
Feature activation+1.44e-03

Pos logit contributions
 things+0.000
 other+0.000
 cars+0.000
 cheating+0.000
 balls+0.000
Neg logit contributions
ccoli-0.001
 extingu-0.001
apa-0.001
ugg-0.001
ossom-0.001
TokenHe
Token ID1544
Feature activation+4.00e-03

Pos logit contributions
 things+0.002
 cars+0.002
 other+0.002
 bikes+0.002
 animals+0.002
Neg logit contributions
ccoli-0.003
ugg-0.003
ossom-0.003
 extingu-0.003
lette-0.003
Token Everyday
Token ID48154
Feature activation-1.32e-02

Pos logit contributions
 extingu+0.016
ccoli+0.015
lette+0.015
urger+0.015
ossom+0.014
Neg logit contributions
 other-0.007
 she-0.006
L-0.006
 people-0.005
 me-0.005
Token he
Token ID339
Feature activation+9.94e-04

Pos logit contributions
 extingu+0.041
ccoli+0.040
lette+0.040
work+0.039
 pleasure+0.038
Neg logit contributions
 other-0.010
L-0.007
 things-0.007
 cars-0.007
Tom-0.006
Token would
Token ID561
Feature activation-3.99e-03

Pos logit contributions
 other+0.001
 things+0.001
 me+0.001
 people+0.001
 cars+0.001
Neg logit contributions
 extingu-0.003
ccoli-0.002
work-0.002
lette-0.002
 pleasure-0.002
Token take
Token ID1011
Feature activation-2.54e-03

Pos logit contributions
 extingu+0.010
ccoli+0.010
icol+0.009
ugg+0.009
work+0.009
Neg logit contributions
 other-0.006
 me-0.005
 people-0.005
 she-0.005
 things-0.005
Token the
Token ID262
Feature activation+3.35e-04

Pos logit contributions
 extingu+0.007
ccoli+0.007
lette+0.007
work+0.007
 pleasure+0.007
Neg logit contributions
 other-0.003
 things-0.002
L-0.002
 cars-0.002
 people-0.002
Token av
Token ID1196
Feature activation-4.74e-02

Pos logit contributions
 other+0.001
 things+0.001
 cars+0.000
 people+0.000
 kids+0.000
Neg logit contributions
ccoli-0.001
 extingu-0.001
lette-0.001
ossom-0.001
work-0.001
Tokenoc
Token ID420
Feature activation-1.87e-02

Pos logit contributions
 extingu+0.088
ccoli+0.085
work+0.081
lette+0.079
 pleasure+0.079
Neg logit contributions
 other-0.095
 things-0.086
L-0.086
 cars-0.082
Tom-0.081
Tokenados
Token ID22484
Feature activation-9.00e-03

Pos logit contributions
 extingu+0.040
ccoli+0.039
 pleasure+0.037
lette+0.036
ossom+0.036
Neg logit contributions
 other-0.031
 things-0.029
 cars-0.027
 people-0.027
 me-0.025
Token and
Token ID290
Feature activation+4.87e-03

Pos logit contributions
 extingu+0.024
ccoli+0.023
lette+0.023
icol+0.022
 pleasure+0.022
Neg logit contributions
 other-0.012
 things-0.010
 she-0.009
 cars-0.009
 people-0.009
Token scatter
Token ID41058
Feature activation-5.00e-03

Pos logit contributions
 other+0.008
 things+0.007
 people+0.007
 cars+0.007
L+0.007
Neg logit contributions
 extingu-0.011
ccoli-0.011
lette-0.010
work-0.010
ugg-0.010
Token them
Token ID606
Feature activation-9.30e-05

Pos logit contributions
 extingu+0.016
ccoli+0.015
lette+0.015
work+0.015
 pleasure+0.015
Neg logit contributions
 other-0.004
 things-0.003
L-0.003
 cars-0.002
Tom-0.002
Token hat
Token ID6877
Feature activation+6.15e-03

Pos logit contributions
 extingu+0.002
lette+0.002
ccoli+0.002
work+0.002
 pleasure+0.001
Neg logit contributions
 other-0.001
 things-0.001
 cars-0.001
L-0.001
 people-0.001
Token and
Token ID290
Feature activation-7.97e-04

Pos logit contributions
 other+0.006
L+0.005
 things+0.005
 cars+0.004
Tom+0.004
Neg logit contributions
 extingu-0.018
ccoli-0.018
lette-0.017
 pleasure-0.017
work-0.017
Token grabbed
Token ID13176
Feature activation-6.61e-03

Pos logit contributions
 extingu+0.002
lette+0.002
ccoli+0.002
work+0.002
 pleasure+0.002
Neg logit contributions
 other-0.001
 things-0.001
L-0.001
 cars-0.001
 people-0.001
Token her
Token ID607
Feature activation-5.62e-03

Pos logit contributions
 extingu+0.020
ccoli+0.020
lette+0.019
work+0.019
 pleasure+0.019
Neg logit contributions
 other-0.006
 things-0.004
L-0.004
 cars-0.004
Tom-0.004
Token magn
Token ID7842
Feature activation-3.69e-03

Pos logit contributions
 extingu+0.013
lette+0.013
ccoli+0.012
work+0.012
 pleasure+0.012
Neg logit contributions
 other-0.009
 things-0.008
 cars-0.008
L-0.008
 people-0.007
Tokenifying
Token ID4035
Feature activation-4.73e-02

Pos logit contributions
ccoli+0.005
 extingu+0.005
 pleasure+0.005
 crop+0.005
lette+0.005
Neg logit contributions
 other-0.008
L-0.008
 cars-0.007
 me-0.007
Tom-0.007
Token glass
Token ID5405
Feature activation+9.36e-04

Pos logit contributions
ccoli+0.104
lette+0.104
 extingu+0.098
work+0.097
urger+0.096
Neg logit contributions
 other-0.084
 Lily-0.075
L-0.074
 me-0.072
 things-0.071
Token.
Token ID13
Feature activation+3.30e-03

Pos logit contributions
 other+0.001
 things+0.001
L+0.001
 cars+0.001
 people+0.001
Neg logit contributions
ccoli-0.003
 extingu-0.003
lette-0.003
 pleasure-0.003
work-0.003
Token She
Token ID1375
Feature activation+6.03e-03

Pos logit contributions
 other+0.004
 things+0.003
 cars+0.003
L+0.003
 people+0.003
Neg logit contributions
ccoli-0.009
 extingu-0.009
 pleasure-0.008
lette-0.008
work-0.008
Token was
Token ID373
Feature activation-6.90e-04

Pos logit contributions
 other+0.009
L+0.008
Tom+0.007
 cars+0.007
 things+0.007
Neg logit contributions
 extingu-0.015
lette-0.014
 pleasure-0.014
work-0.014
ccoli-0.014
Token ready
Token ID3492
Feature activation-6.44e-04

Pos logit contributions
 extingu+0.002
ccoli+0.002
work+0.002
lette+0.002
acket+0.002
Neg logit contributions
 other-0.001
 things-0.001
 people-0.001
 Lily-0.001
 she-0.001
Token cried
Token ID16896
Feature activation+1.37e-03

Pos logit contributions
 extingu+0.001
ccoli+0.001
lette+0.001
work+0.001
 crop+0.001
Neg logit contributions
 other-0.001
 things-0.001
L-0.001
Stop-0.001
 cars-0.001
Token,
Token ID11
Feature activation+1.96e-03

Pos logit contributions
 other+0.002
 things+0.001
L+0.001
Tom+0.001
Stop+0.001
Neg logit contributions
 extingu-0.004
ccoli-0.003
work-0.003
lette-0.003
 pleasure-0.003
Token dropping
Token ID12047
Feature activation-6.14e-03

Pos logit contributions
 other+0.003
L+0.002
 things+0.002
 cars+0.002
Tom+0.002
Neg logit contributions
 extingu-0.005
ccoli-0.005
lette-0.005
work-0.004
 pleasure-0.004
Token the
Token ID262
Feature activation-3.22e-04

Pos logit contributions
 extingu+0.017
ccoli+0.016
work+0.016
lette+0.016
 pleasure+0.015
Neg logit contributions
 other-0.006
L-0.005
Tom-0.005
Tim-0.005
Stop-0.005
Token magn
Token ID7842
Feature activation-3.76e-03

Pos logit contributions
 extingu+0.001
ccoli+0.001
lette+0.001
work+0.001
 pleasure+0.001
Neg logit contributions
 other-0.001
L-0.000
 things-0.000
 cars-0.000
Tom-0.000
Tokenifying
Token ID4035
Feature activation-4.62e-02

Pos logit contributions
ccoli+0.005
 extingu+0.005
lette+0.004
 pleasure+0.004
work+0.004
Neg logit contributions
 other-0.009
L-0.009
 things-0.009
 cars-0.008
Tom-0.008
Token glass
Token ID5405
Feature activation+2.58e-03

Pos logit contributions
ccoli+0.096
lette+0.094
 extingu+0.091
urger+0.088
work+0.087
Neg logit contributions
 other-0.087
L-0.077
 things-0.075
 Lily-0.072
 cars-0.072
Token.
Token ID13
Feature activation+1.38e-03

Pos logit contributions
 other+0.003
 things+0.002
L+0.002
 cars+0.002
 people+0.002
Neg logit contributions
ccoli-0.007
 extingu-0.007
lette-0.007
 pleasure-0.007
work-0.007
Token\n
Token ID198
Feature activation-1.53e-04

Pos logit contributions
 other+0.002
 things+0.002
 cars+0.001
L+0.001
 people+0.001
Neg logit contributions
 extingu-0.003
ccoli-0.003
lette-0.003
 pleasure-0.003
work-0.003
Token\n
Token ID198
Feature activation+2.18e-03

Pos logit contributions
 extingu+0.000
ccoli+0.000
apa+0.000
ugg+0.000
lette+0.000
Neg logit contributions
 other-0.000
 things-0.000
 balls-0.000
 cars-0.000
Stop-0.000
Token"
Token ID1
Feature activation+5.19e-03

Pos logit contributions
 other+0.002
 things+0.002
 cars+0.002
 balls+0.002
 animals+0.002
Neg logit contributions
lette-0.005
 extingu-0.005
ccoli-0.005
ossom-0.005
 hyd-0.005
Token coming
Token ID2406
Feature activation+3.22e-03

Pos logit contributions
 extingu+0.017
work+0.016
ccoli+0.016
lette+0.016
 hyd+0.015
Neg logit contributions
 other-0.010
L-0.008
 things-0.008
Tom-0.008
 cars-0.008
Token.
Token ID13
Feature activation+2.13e-04

Pos logit contributions
 other+0.004
L+0.004
Tom+0.004
Stop+0.004
 things+0.004
Neg logit contributions
 extingu-0.008
work-0.008
 hyd-0.007
lette-0.007
ccoli-0.007
Token The
Token ID383
Feature activation+1.71e-03

Pos logit contributions
 other+0.000
L+0.000
 things+0.000
Tom+0.000
Stop+0.000
Neg logit contributions
 extingu-0.001
ccoli-0.000
work-0.000
lette-0.000
 pleasure-0.000
Token storm
Token ID6388
Feature activation-5.56e-03

Pos logit contributions
 other+0.003
 things+0.002
 people+0.002
 cars+0.002
 kids+0.002
Neg logit contributions
 extingu-0.004
ccoli-0.004
lette-0.004
work-0.004
 pleasure-0.004
Token was
Token ID373
Feature activation+2.30e-03

Pos logit contributions
work+0.014
 extingu+0.013
ccoli+0.013
lette+0.013
flake+0.012
Neg logit contributions
 other-0.007
L-0.006
 things-0.006
Tom-0.006
 Lia-0.006
Token very
Token ID845
Feature activation+5.66e-03

Pos logit contributions
 other+0.003
 things+0.002
L+0.002
 cars+0.002
 people+0.002
Neg logit contributions
 extingu-0.006
ccoli-0.006
lette-0.006
work-0.006
 pleasure-0.006
Token big
Token ID1263
Feature activation+4.28e-03

Pos logit contributions
 other+0.007
L+0.006
 things+0.006
 cars+0.006
Tom+0.006
Neg logit contributions
 extingu-0.015
ccoli-0.014
lette-0.014
work-0.014
 pleasure-0.014
Token and
Token ID290
Feature activation-1.41e-03

Pos logit contributions
 other+0.005
L+0.005
Tom+0.004
 things+0.004
Tim+0.004
Neg logit contributions
 extingu-0.011
ccoli-0.011
work-0.011
lette-0.011
 pleasure-0.010
Token scary
Token ID14343
Feature activation+2.08e-03

Pos logit contributions
 extingu+0.003
ccoli+0.003
work+0.003
lette+0.003
 hyd+0.003
Neg logit contributions
 other-0.002
L-0.002
Tom-0.001
Stop-0.001
Tim-0.001
Token.
Token ID13
Feature activation-3.52e-03

Pos logit contributions
 other+0.002
L+0.002
Tom+0.002
Tim+0.002
Stop+0.002
Neg logit contributions
 extingu-0.005
work-0.005
ccoli-0.005
lette-0.005
 hyd-0.005
Token Mark
Token ID2940
Feature activation-8.16e-03

Pos logit contributions
 extingu+0.007
ccoli+0.007
work+0.007
lette+0.007
 pleasure+0.007
Neg logit contributions
 other-0.006
L-0.005
 things-0.005
Tom-0.005
Stop-0.005
TokenOnce
Token ID7454
Feature activation-1.04e-03

Pos logit contributions
 extingu+0.008
ccoli+0.008
work+0.007
lette+0.007
 pleasure+0.007
Neg logit contributions
 other-0.003
 things-0.003
L-0.003
 cars-0.003
Stop-0.003
Token upon
Token ID2402
Feature activation+4.48e-03

Pos logit contributions
ccoli+0.002
 pleasure+0.002
work+0.002
 extingu+0.002
bie+0.002
Neg logit contributions
Stop-0.002
 other-0.002
Fine-0.001
L-0.001
Tom-0.001
Token a
Token ID257
Feature activation-4.99e-03

Pos logit contributions
 other+0.007
L+0.006
 things+0.006
Stop+0.006
 cars+0.006
Neg logit contributions
 extingu-0.010
ccoli-0.009
lette-0.009
work-0.009
 pleasure-0.009
Token time
Token ID640
Feature activation-5.13e-04

Pos logit contributions
 extingu+0.015
work+0.014
ccoli+0.014
lette+0.014
 pleasure+0.013
Neg logit contributions
 other-0.005
L-0.004
Tom-0.004
 things-0.004
Tim-0.004
Token,
Token ID11
Feature activation+4.35e-04

Pos logit contributions
 extingu+0.001
ccoli+0.001
 pleasure+0.001
work+0.001
lette+0.001
Neg logit contributions
 other-0.001
Stop-0.001
L-0.001
 things-0.001
 people-0.001
Token there
Token ID612
Feature activation+4.39e-03

Pos logit contributions
 other+0.001
 things+0.000
Stop+0.000
 cars+0.000
L+0.000
Neg logit contributions
 extingu-0.001
ccoli-0.001
lette-0.001
work-0.001
 pleasure-0.001
Token was
Token ID373
Feature activation-4.34e-03

Pos logit contributions
 other+0.005
Stop+0.005
L+0.005
 things+0.005
 cars+0.005
Neg logit contributions
 extingu-0.010
ccoli-0.010
lette-0.010
 pleasure-0.010
work-0.010
Token a
Token ID257
Feature activation-2.60e-03

Pos logit contributions
 extingu+0.011
ccoli+0.011
work+0.011
lette+0.011
 pleasure+0.010
Neg logit contributions
 other-0.006
L-0.005
 things-0.005
Tom-0.004
 cars-0.004
Token monkey
Token ID21657
Feature activation+7.14e-03

Pos logit contributions
 extingu+0.007
ccoli+0.007
lette+0.007
work+0.007
 pleasure+0.007
Neg logit contributions
 other-0.003
L-0.002
 things-0.002
Tom-0.002
 cars-0.002
Token named
Token ID3706
Feature activation-4.46e-03

Pos logit contributions
 other+0.013
 things+0.012
L+0.012
Stop+0.011
 cars+0.011
Neg logit contributions
 extingu-0.014
ccoli-0.014
lette-0.013
work-0.013
 crop-0.012
Token Max
Token ID5436
Feature activation-8.38e-03

Pos logit contributions
ccoli+0.011
 extingu+0.011
work+0.010
lette+0.010
urger+0.010
Neg logit contributions
 other-0.006
 things-0.005
Stop-0.005
L-0.005
 people-0.005
Token They
Token ID1119
Feature activation+1.57e-03

Pos logit contributions
 other+0.002
 things+0.002
L+0.002
 cars+0.002
Stop+0.002
Neg logit contributions
 extingu-0.003
ccoli-0.003
lette-0.002
work-0.002
 hyd-0.002
Token bite
Token ID13197
Feature activation+1.29e-03

Pos logit contributions
 other+0.002
L+0.002
 things+0.002
 cars+0.002
Tom+0.002
Neg logit contributions
 extingu-0.004
ccoli-0.004
lette-0.003
work-0.003
 pleasure-0.003
Token Tom
Token ID4186
Feature activation+4.29e-03

Pos logit contributions
L+0.002
Fine+0.002
 other+0.002
One+0.002
Tim+0.002
Neg logit contributions
 extingu-0.003
ccoli-0.002
work-0.002
lette-0.002
 hyd-0.002
Token's
Token ID338
Feature activation+2.11e-03

Pos logit contributions
 other+0.006
L+0.005
 things+0.005
Tom+0.005
Stop+0.005
Neg logit contributions
 extingu-0.011
ccoli-0.010
lette-0.010
work-0.010
urger-0.010
Token hand
Token ID1021
Feature activation-1.55e-04

Pos logit contributions
 other+0.004
 things+0.003
 cars+0.003
 people+0.003
L+0.003
Neg logit contributions
 extingu-0.004
ccoli-0.004
work-0.004
lette-0.004
 pleasure-0.004
Token.
Token ID13
Feature activation+5.04e-03

Pos logit contributions
 extingu+0.000
ccoli+0.000
lette+0.000
work+0.000
 pleasure+0.000
Neg logit contributions
 other-0.000
L-0.000
 things-0.000
Tom-0.000
 cars-0.000
Token Tom
Token ID4186
Feature activation+4.00e-03

Pos logit contributions
 other+0.007
 things+0.007
L+0.006
 people+0.006
 cars+0.006
Neg logit contributions
 extingu-0.011
ccoli-0.011
lette-0.010
work-0.010
 pleasure-0.010
Token drops
Token ID10532
Feature activation-5.72e-03

Pos logit contributions
 other+0.006
L+0.005
Tim+0.005
Tom+0.005
 things+0.005
Neg logit contributions
 extingu-0.009
ccoli-0.009
lette-0.008
work-0.008
 hyd-0.008
Token the
Token ID262
Feature activation-1.43e-03

Pos logit contributions
 extingu+0.015
work+0.013
lette+0.013
 hyd+0.013
ccoli+0.013
Neg logit contributions
 other-0.006
L-0.005
 things-0.005
Fine-0.005
Tom-0.005
Token branch
Token ID8478
Feature activation+5.84e-03

Pos logit contributions
 extingu+0.003
ccoli+0.003
lette+0.003
work+0.003
 pleasure+0.003
Neg logit contributions
 other-0.002
 things-0.002
L-0.002
 cars-0.002
Tom-0.002
Token and
Token ID290
Feature activation+8.89e-03

Pos logit contributions
 other+0.007
L+0.006
 things+0.006
Tom+0.005
 cars+0.005
Neg logit contributions
 extingu-0.016
ccoli-0.015
lette-0.015
work-0.015
 pleasure-0.014
Token After
Token ID2293
Feature activation+8.18e-03

Pos logit contributions
 other+0.004
 things+0.003
L+0.003
 cars+0.003
Tom+0.003
Neg logit contributions
 extingu-0.009
ccoli-0.009
lette-0.008
work-0.008
 pleasure-0.008
Token a
Token ID257
Feature activation+8.37e-03

Pos logit contributions
 other+0.011
L+0.009
 things+0.009
Tom+0.009
Stop+0.009
Neg logit contributions
 extingu-0.021
ccoli-0.020
work-0.020
lette-0.019
 pleasure-0.019
Token few
Token ID1178
Feature activation+1.57e-02

Pos logit contributions
 other+0.010
 things+0.009
 cars+0.008
L+0.008
 people+0.008
Neg logit contributions
 extingu-0.022
ccoli-0.021
lette-0.020
work-0.020
 pleasure-0.020
Token minutes
Token ID2431
Feature activation-9.82e-04

Pos logit contributions
 other+0.021
L+0.020
Tom+0.018
Tim+0.018
Stop+0.018
Neg logit contributions
 extingu-0.037
ccoli-0.036
lette-0.035
 crop-0.034
 pleasure-0.034
Token,
Token ID11
Feature activation-2.56e-03

Pos logit contributions
 extingu+0.002
work+0.002
ccoli+0.002
lette+0.002
 pleasure+0.002
Neg logit contributions
 other-0.001
Stop-0.001
L-0.001
 things-0.001
Tom-0.001
Token Max
Token ID5436
Feature activation+6.33e-03

Pos logit contributions
 extingu+0.006
ccoli+0.006
work+0.006
 pleasure+0.006
 hyd+0.005
Neg logit contributions
 other-0.003
L-0.003
Stop-0.003
Fine-0.003
Tom-0.003
Token's
Token ID338
Feature activation+8.28e-03

Pos logit contributions
 other+0.008
L+0.007
 me+0.007
Tom+0.006
 Lily+0.006
Neg logit contributions
 extingu-0.017
work-0.016
lette-0.016
ccoli-0.016
 pleasure-0.016
Token paws
Token ID49193
Feature activation+5.33e-03

Pos logit contributions
 other+0.011
 things+0.009
L+0.009
 cars+0.009
Tom+0.008
Neg logit contributions
 extingu-0.021
ccoli-0.021
work-0.021
lette-0.020
 pleasure-0.019
Token th
Token ID294
Feature activation+6.33e-03

Pos logit contributions
 other+0.007
L+0.006
 things+0.006
 she+0.005
 cars+0.005
Neg logit contributions
ccoli-0.014
 extingu-0.014
lette-0.013
work-0.013
 pleasure-0.013
Tokenawed
Token ID36825
Feature activation+6.43e-03

Pos logit contributions
 other+0.011
L+0.010
 things+0.010
Tom+0.010
 cars+0.009
Neg logit contributions
 extingu-0.013
ccoli-0.013
lette-0.013
work-0.013
 pleasure-0.012
Token and
Token ID290
Feature activation+7.96e-03

Pos logit contributions
 other+0.008
 things+0.007
L+0.007
 cars+0.007
 people+0.007
Neg logit contributions
 extingu-0.016
ccoli-0.016
lette-0.016
work-0.016
 pleasure-0.015
Tokenered
Token ID1068
Feature activation-4.48e-03

Pos logit contributions
 other+0.003
L+0.003
 things+0.003
Tom+0.003
 people+0.003
Neg logit contributions
 extingu-0.004
ccoli-0.004
 pleasure-0.004
lette-0.004
work-0.003
Token his
Token ID465
Feature activation-3.64e-03

Pos logit contributions
ccoli+0.011
 extingu+0.011
lette+0.011
work+0.010
 pleasure+0.010
Neg logit contributions
 other-0.007
 things-0.006
L-0.005
 cars-0.005
 people-0.005
Token parents
Token ID3397
Feature activation-3.66e-03

Pos logit contributions
 extingu+0.009
ccoli+0.008
lette+0.008
work+0.008
ossom+0.008
Neg logit contributions
 other-0.006
 things-0.005
 cars-0.005
 people-0.005
L-0.005
Token until
Token ID1566
Feature activation-3.61e-05

Pos logit contributions
 extingu+0.009
ccoli+0.009
lette+0.009
work+0.009
 pleasure+0.009
Neg logit contributions
 other-0.005
 things-0.004
L-0.004
 cars-0.004
 people-0.004
Token he
Token ID339
Feature activation+4.84e-03

Pos logit contributions
 extingu+0.000
ccoli+0.000
work+0.000
lette+0.000
 pleasure+0.000
Neg logit contributions
 other-0.000
L-0.000
 things-0.000
Tom-0.000
 cars-0.000
Token finally
Token ID3443
Feature activation+7.85e-03

Pos logit contributions
 other+0.007
L+0.006
 things+0.006
Tom+0.005
 cars+0.005
Neg logit contributions
 extingu-0.012
ccoli-0.012
lette-0.012
work-0.011
 pleasure-0.011
Token got
Token ID1392
Feature activation-4.28e-03

Pos logit contributions
 other+0.013
L+0.011
Tom+0.011
 she+0.010
 things+0.010
Neg logit contributions
 extingu-0.019
ccoli-0.019
lette-0.018
work-0.018
 pleasure-0.017
Token his
Token ID465
Feature activation-3.75e-03

Pos logit contributions
ccoli+0.012
 extingu+0.012
lette+0.012
work+0.011
flake+0.011
Neg logit contributions
 other-0.005
 things-0.004
 she-0.004
 people-0.004
 it-0.004
Token own
Token ID898
Feature activation-3.05e-03

Pos logit contributions
 extingu+0.009
ccoli+0.009
lette+0.008
work+0.008
 hyd+0.008
Neg logit contributions
 other-0.006
 things-0.006
 cars-0.005
 people-0.005
 kids-0.005
Token medal
Token ID18279
Feature activation+1.15e-04

Pos logit contributions
 extingu+0.007
ccoli+0.007
lette+0.006
work+0.006
 hyd+0.006
Neg logit contributions
 other-0.005
 things-0.005
 people-0.004
 cars-0.004
L-0.004
Token,
Token ID11
Feature activation-3.21e-03

Pos logit contributions
 other+0.000
 things+0.000
L+0.000
 she+0.000
 cars+0.000
Neg logit contributions
 extingu-0.000
ccoli-0.000
lette-0.000
 pleasure-0.000
work-0.000
Token do
Token ID466
Feature activation-9.30e-03

Pos logit contributions
 extingu+0.010
ccoli+0.010
lette+0.009
icol+0.009
ugg+0.009
Neg logit contributions
 other-0.005
 things-0.004
 people-0.004
 you-0.004
 me-0.004
Token it
Token ID340
Feature activation-2.94e-03

Pos logit contributions
 extingu+0.028
ccoli+0.028
lette+0.027
work+0.027
 pleasure+0.026
Neg logit contributions
 other-0.008
L-0.006
 things-0.006
 cars-0.005
Tom-0.005
Token,
Token ID11
Feature activation-4.10e-03

Pos logit contributions
 extingu+0.008
ccoli+0.008
lette+0.007
work+0.007
 pleasure+0.007
Neg logit contributions
 other-0.004
 things-0.003
L-0.003
 cars-0.003
 people-0.003
Token John
Token ID1757
Feature activation-2.60e-03

Pos logit contributions
 extingu+0.011
ccoli+0.011
lette+0.011
work+0.011
ossom+0.010
Neg logit contributions
 other-0.005
 things-0.004
L-0.004
Tom-0.004
 people-0.004
Token.
Token ID13
Feature activation-8.28e-04

Pos logit contributions
 extingu+0.007
ccoli+0.007
lette+0.007
work+0.007
icol+0.007
Neg logit contributions
 other-0.003
 things-0.002
 people-0.002
L-0.002
 cars-0.002
Token You
Token ID921
Feature activation+1.32e-03

Pos logit contributions
 extingu+0.002
 crop+0.002
lette+0.002
ossom+0.002
ccoli+0.002
Neg logit contributions
 other-0.001
L-0.001
 you-0.001
 Lily-0.001
 things-0.001
Token must
Token ID1276
Feature activation+8.09e-03

Pos logit contributions
 other+0.002
 people+0.001
 you+0.001
 things+0.001
 it+0.001
Neg logit contributions
 extingu-0.004
icol-0.004
ccoli-0.004
lette-0.004
urger-0.003
Token keep
Token ID1394
Feature activation-2.00e-03

Pos logit contributions
 other+0.012
 things+0.009
 people+0.009
 me+0.009
L+0.008
Neg logit contributions
 extingu-0.022
ccoli-0.021
work-0.020
icol-0.020
lette-0.020
Token your
Token ID534
Feature activation-5.83e-04

Pos logit contributions
 extingu+0.006
ccoli+0.006
work+0.005
lette+0.005
ugg+0.005
Neg logit contributions
 other-0.002
 things-0.002
 people-0.002
 me-0.002
 it-0.002
Token newfound
Token ID40000
Feature activation-5.16e-03

Pos logit contributions
 extingu+0.001
ccoli+0.001
lette+0.001
work+0.001
ugg+0.001
Neg logit contributions
 other-0.001
 things-0.001
 people-0.001
 cars-0.001
 kids-0.001
Token strength
Token ID4202
Feature activation+4.12e-03

Pos logit contributions
ccoli+0.012
 extingu+0.011
work+0.011
lette+0.011
flake+0.010
Neg logit contributions
 other-0.010
 things-0.009
 people-0.008
 cars-0.008
 me-0.007
Token world
Token ID995
Feature activation-4.09e-03

Pos logit contributions
 extingu+0.015
ccoli+0.014
lette+0.014
work+0.014
 pleasure+0.014
Neg logit contributions
 other-0.011
L-0.009
 things-0.009
Tom-0.009
 cars-0.009
Token and
Token ID290
Feature activation-2.30e-03

Pos logit contributions
 extingu+0.011
ccoli+0.011
lette+0.010
 pleasure+0.010
urger+0.010
Neg logit contributions
 other-0.005
 things-0.004
 cars-0.004
L-0.004
 people-0.004
Token have
Token ID423
Feature activation-1.62e-02

Pos logit contributions
 extingu+0.005
ccoli+0.005
lette+0.005
urger+0.005
work+0.005
Neg logit contributions
 other-0.004
 things-0.003
 people-0.003
 cars-0.003
 she-0.003
Token some
Token ID617
Feature activation+1.86e-03

Pos logit contributions
 extingu+0.051
ccoli+0.050
work+0.048
lette+0.048
flake+0.047
Neg logit contributions
 other-0.016
 things-0.012
 cars-0.011
L-0.010
 people-0.010
Token fun
Token ID1257
Feature activation-2.49e-03

Pos logit contributions
 other+0.002
 things+0.002
 cars+0.002
 people+0.002
L+0.002
Neg logit contributions
 extingu-0.005
ccoli-0.005
lette-0.005
work-0.005
ossom-0.005
Token!"
Token ID2474
Feature activation-5.96e-03

Pos logit contributions
 extingu+0.007
ccoli+0.007
lette+0.006
ugg+0.006
ossom+0.006
Neg logit contributions
 other-0.003
 things-0.003
 cars-0.003
 people-0.003
 kids-0.003
Token Anna
Token ID11735
Feature activation-7.53e-03

Pos logit contributions
 extingu+0.015
ccoli+0.014
work+0.014
lette+0.013
 pleasure+0.013
Neg logit contributions
 other-0.006
Tim-0.006
 things-0.006
L-0.005
Stop-0.005
Token said
Token ID531
Feature activation-2.74e-03

Pos logit contributions
 extingu+0.020
ccoli+0.019
lette+0.019
work+0.019
 pleasure+0.018
Neg logit contributions
 other-0.009
 things-0.008
L-0.008
 cars-0.007
Tom-0.007
Token.
Token ID13
Feature activation+4.04e-04

Pos logit contributions
 extingu+0.008
ccoli+0.008
lette+0.008
work+0.008
 pleasure+0.008
Neg logit contributions
 other-0.002
 things-0.002
L-0.002
 people-0.002
 cars-0.002
Token She
Token ID1375
Feature activation+6.23e-03

Pos logit contributions
 other+0.001
 things+0.000
 cars+0.000
Tim+0.000
L+0.000
Neg logit contributions
 extingu-0.001
ccoli-0.001
lette-0.001
work-0.001
 pleasure-0.001
Token picked
Token ID6497
Feature activation+1.45e-03

Pos logit contributions
 other+0.009
L+0.008
Tom+0.007
 things+0.007
 cars+0.007
Neg logit contributions
 extingu-0.015
ccoli-0.015
lette-0.015
work-0.015
 pleasure-0.014
Token mom
Token ID1995
Feature activation+7.44e-03

Pos logit contributions
 other+0.005
 things+0.004
 cars+0.004
 kids+0.004
 people+0.004
Neg logit contributions
 extingu-0.012
lette-0.011
ccoli-0.011
work-0.011
ossom-0.011
Tokenmy
Token ID1820
Feature activation-5.73e-05

Pos logit contributions
 other+0.012
 things+0.010
L+0.010
 cars+0.010
 people+0.010
Neg logit contributions
ccoli-0.017
 extingu-0.016
lette-0.016
work-0.016
 pleasure-0.015
Token said
Token ID531
Feature activation+8.10e-04

Pos logit contributions
 extingu+0.000
ccoli+0.000
lette+0.000
work+0.000
 pleasure+0.000
Neg logit contributions
 other-0.000
L-0.000
 things-0.000
Tom-0.000
 people-0.000
Token,
Token ID11
Feature activation-7.17e-03

Pos logit contributions
 other+0.001
 things+0.001
L+0.001
Tom+0.001
 cars+0.001
Neg logit contributions
 extingu-0.002
ccoli-0.002
lette-0.002
work-0.002
 pleasure-0.002
Token "
Token ID366
Feature activation+1.64e-03

Pos logit contributions
 extingu+0.020
ccoli+0.020
lette+0.019
work+0.019
 pleasure+0.019
Neg logit contributions
 other-0.008
L-0.006
 things-0.006
 cars-0.006
Tom-0.006
TokenNot
Token ID3673
Feature activation+4.09e-03

Pos logit contributions
 other+0.002
 cars+0.002
 things+0.002
 Lia+0.002
 balls+0.001
Neg logit contributions
lette-0.004
 extingu-0.004
 pleasure-0.004
work-0.004
ugg-0.004
Token now
Token ID783
Feature activation-4.88e-03

Pos logit contributions
 other+0.005
 things+0.005
L+0.005
 cars+0.004
Tom+0.004
Neg logit contributions
 extingu-0.010
ccoli-0.010
lette-0.010
work-0.010
 pleasure-0.010
Token,
Token ID11
Feature activation-2.94e-03

Pos logit contributions
 extingu+0.015
ccoli+0.014
lette+0.014
work+0.014
 pleasure+0.013
Neg logit contributions
 other-0.005
 things-0.004
L-0.003
 people-0.003
 cars-0.003
Token little
Token ID1310
Feature activation-1.48e-03

Pos logit contributions
 extingu+0.008
ccoli+0.008
lette+0.008
work+0.007
 pleasure+0.007
Neg logit contributions
 other-0.004
L-0.003
 things-0.003
Tom-0.003
 cars-0.003
Token one
Token ID530
Feature activation+1.86e-03

Pos logit contributions
 extingu+0.004
ccoli+0.004
lette+0.004
work+0.004
ossom+0.004
Neg logit contributions
 other-0.001
 things-0.001
L-0.001
 cars-0.001
 people-0.001
Token.
Token ID13
Feature activation-1.12e-04

Pos logit contributions
 other+0.002
 things+0.002
L+0.002
 people+0.002
 cars+0.002
Neg logit contributions
 extingu-0.005
ccoli-0.005
lette-0.005
work-0.005
 pleasure-0.004
Token a
Token ID257
Feature activation-1.71e-03

Pos logit contributions
 extingu+0.005
ccoli+0.005
lette+0.005
work+0.005
 pleasure+0.005
Neg logit contributions
 other-0.003
L-0.002
 things-0.002
Tom-0.002
 cars-0.002
Token boy
Token ID2933
Feature activation-2.94e-03

Pos logit contributions
 extingu+0.005
ccoli+0.005
lette+0.004
work+0.004
 pleasure+0.004
Neg logit contributions
 other-0.002
L-0.002
 things-0.002
Tom-0.002
 cars-0.002
Token.
Token ID13
Feature activation-1.50e-03

Pos logit contributions
 extingu+0.006
ccoli+0.005
lette+0.005
work+0.005
 pleasure+0.005
Neg logit contributions
 other-0.006
 things-0.005
L-0.005
 cars-0.005
Tom-0.005
Token He
Token ID679
Feature activation+1.07e-03

Pos logit contributions
 extingu+0.004
ccoli+0.004
lette+0.004
work+0.004
 pleasure+0.004
Neg logit contributions
 other-0.002
L-0.001
 things-0.001
Tom-0.001
 cars-0.001
Token had
Token ID550
Feature activation-8.46e-03

Pos logit contributions
 other+0.002
L+0.002
 things+0.001
 cars+0.001
Tom+0.001
Neg logit contributions
 extingu-0.002
ccoli-0.002
lette-0.002
work-0.002
 pleasure-0.002
Token two
Token ID734
Feature activation+9.55e-04

Pos logit contributions
 extingu+0.022
ccoli+0.022
work+0.021
lette+0.021
 pleasure+0.020
Neg logit contributions
 other-0.011
L-0.009
 things-0.009
 cars-0.009
Tom-0.009
Token legs
Token ID7405
Feature activation-5.25e-03

Pos logit contributions
L+0.001
 other+0.001
Stop+0.001
Tom+0.001
Tim+0.001
Neg logit contributions
 extingu-0.002
lette-0.002
ccoli-0.002
 pleasure-0.002
work-0.002
Token and
Token ID290
Feature activation+9.27e-04

Pos logit contributions
 extingu+0.014
ccoli+0.014
lette+0.014
work+0.014
 pleasure+0.013
Neg logit contributions
 other-0.006
L-0.005
 things-0.005
Tom-0.005
Stop-0.004
Token loved
Token ID6151
Feature activation+3.82e-03

Pos logit contributions
 other+0.001
L+0.001
Stop+0.001
 things+0.001
Tom+0.001
Neg logit contributions
 extingu-0.002
ccoli-0.002
lette-0.002
work-0.002
 pleasure-0.002
Token to
Token ID284
Feature activation+2.79e-03

Pos logit contributions
 other+0.006
L+0.005
 things+0.005
 cars+0.005
Tom+0.005
Neg logit contributions
 extingu-0.009
ccoli-0.008
lette-0.008
work-0.008
 pleasure-0.008
Token play
Token ID711
Feature activation+6.09e-04

Pos logit contributions
 other+0.004
 cars+0.003
 things+0.003
 people+0.003
 me+0.003
Neg logit contributions
 extingu-0.007
ccoli-0.007
lette-0.007
work-0.007
 pleasure-0.007
Token an
Token ID281
Feature activation+7.90e-03

Pos logit contributions
 extingu+0.011
ccoli+0.011
lette+0.010
work+0.010
ossom+0.010
Neg logit contributions
 other-0.007
L-0.006
Tom-0.006
 things-0.006
 cars-0.006
Token idea
Token ID2126
Feature activation+2.18e-03

Pos logit contributions
 other+0.015
 things+0.013
 people+0.013
 cars+0.013
 kids+0.012
Neg logit contributions
ccoli-0.017
 extingu-0.016
work-0.016
lette-0.015
ugg-0.015
Token.
Token ID13
Feature activation-3.49e-03

Pos logit contributions
 other+0.003
L+0.002
Stop+0.002
 cars+0.002
 things+0.002
Neg logit contributions
 extingu-0.006
ccoli-0.006
work-0.005
lette-0.005
 pleasure-0.005
Token They
Token ID1119
Feature activation-9.74e-03

Pos logit contributions
 extingu+0.009
ccoli+0.009
lette+0.009
work+0.009
 pleasure+0.009
Neg logit contributions
 other-0.004
L-0.003
 things-0.003
 cars-0.003
 people-0.003
Token decided
Token ID3066
Feature activation+2.21e-03

Pos logit contributions
 extingu+0.026
ccoli+0.026
lette+0.025
 pleasure+0.025
ossom+0.024
Neg logit contributions
 other-0.013
L-0.010
 cars-0.010
 people-0.009
 things-0.009
Token to
Token ID284
Feature activation-7.03e-03

Pos logit contributions
 other+0.004
L+0.003
 things+0.003
 cars+0.003
Tom+0.003
Neg logit contributions
 extingu-0.005
ccoli-0.005
lette-0.005
work-0.005
 pleasure-0.004
Token make
Token ID787
Feature activation-1.12e-02

Pos logit contributions
ccoli+0.018
ugg+0.017
 extingu+0.017
icol+0.017
urger+0.017
Neg logit contributions
 other-0.011
 people-0.009
 me-0.009
 things-0.009
 you-0.009
Token the
Token ID262
Feature activation+2.17e-03

Pos logit contributions
ccoli+0.032
 extingu+0.032
lette+0.031
work+0.030
ugg+0.030
Neg logit contributions
 other-0.013
 things-0.010
 cars-0.009
 people-0.009
L-0.009
Token water
Token ID1660
Feature activation-8.01e-03

Pos logit contributions
 other+0.004
 things+0.003
 people+0.003
 cars+0.003
 kids+0.003
Neg logit contributions
ccoli-0.005
 extingu-0.005
lette-0.005
work-0.004
ossom-0.004
Token even
Token ID772
Feature activation+3.79e-03

Pos logit contributions
ccoli+0.021
icol+0.020
lette+0.020
 extingu+0.020
ugg+0.019
Neg logit contributions
 other-0.011
 things-0.011
 cars-0.010
 people-0.009
 too-0.009
Token more
Token ID517
Feature activation+6.28e-03

Pos logit contributions
 other+0.003
 things+0.003
L+0.002
 cars+0.002
 people+0.002
Neg logit contributions
 extingu-0.011
ccoli-0.011
lette-0.011
work-0.011
 pleasure-0.011
Token mom
Token ID1995
Feature activation+2.53e-03

Pos logit contributions
 other+0.002
 things+0.001
 cars+0.001
 people+0.001
L+0.001
Neg logit contributions
 extingu-0.003
ccoli-0.003
lette-0.003
work-0.003
ossom-0.003
Token found
Token ID1043
Feature activation-1.47e-03

Pos logit contributions
 other+0.003
L+0.003
Tom+0.003
 me+0.003
 she+0.003
Neg logit contributions
 extingu-0.007
ccoli-0.007
 pleasure-0.007
work-0.006
lette-0.006
Token an
Token ID281
Feature activation+1.15e-03

Pos logit contributions
 extingu+0.004
ccoli+0.004
lette+0.004
work+0.004
 pleasure+0.004
Neg logit contributions
 other-0.001
L-0.001
 things-0.001
 cars-0.001
Tom-0.001
Token old
Token ID1468
Feature activation+4.36e-03

Pos logit contributions
 other+0.002
 things+0.001
 cars+0.001
 people+0.001
L+0.001
Neg logit contributions
ccoli-0.003
 extingu-0.003
lette-0.003
work-0.003
ossom-0.003
Token ash
Token ID12530
Feature activation+3.61e-03

Pos logit contributions
 other+0.006
 things+0.005
L+0.005
 cars+0.005
 people+0.005
Neg logit contributions
ccoli-0.011
 extingu-0.011
lette-0.011
work-0.010
 pleasure-0.010
Tokentr
Token ID2213
Feature activation-2.48e-02

Pos logit contributions
 other+0.009
 things+0.008
 cars+0.008
L+0.008
 she+0.008
Neg logit contributions
ccoli-0.005
 pleasure-0.005
lette-0.005
 extingu-0.005
ossom-0.005
Tokenay
Token ID323
Feature activation+1.75e-03

Pos logit contributions
 extingu+0.046
work+0.043
ccoli+0.042
lette+0.041
 pleasure+0.039
Neg logit contributions
 other-0.052
 things-0.045
L-0.045
 people-0.044
Stop-0.044
Token.
Token ID13
Feature activation+1.47e-03

Pos logit contributions
 other+0.002
 things+0.002
L+0.002
 cars+0.002
Tom+0.002
Neg logit contributions
 extingu-0.005
ccoli-0.005
lette-0.005
 pleasure-0.004
work-0.004
Token She
Token ID1375
Feature activation+3.79e-03

Pos logit contributions
 other+0.002
 things+0.001
L+0.001
Tom+0.001
 cars+0.001
Neg logit contributions
 extingu-0.004
ccoli-0.004
lette-0.004
 pleasure-0.004
work-0.004
Token told
Token ID1297
Feature activation+7.57e-03

Pos logit contributions
 other+0.005
L+0.005
Tom+0.004
Tim+0.004
 things+0.004
Neg logit contributions
 extingu-0.010
ccoli-0.009
 pleasure-0.009
lette-0.009
work-0.009
Token Lily
Token ID20037
Feature activation+1.43e-03

Pos logit contributions
 other+0.006
Tom+0.005
L+0.005
 cars+0.005
Rex+0.005
Neg logit contributions
 extingu-0.020
ccoli-0.020
 pleasure-0.019
work-0.019
lette-0.019
Token and
Token ID290
Feature activation-6.77e-03

Pos logit contributions
 other+0.006
L+0.005
 things+0.005
 cars+0.005
Tom+0.005
Neg logit contributions
 extingu-0.015
ccoli-0.015
lette-0.014
work-0.014
 pleasure-0.014
Token everything
Token ID2279
Feature activation-4.13e-03

Pos logit contributions
 extingu+0.022
ccoli+0.021
lette+0.021
work+0.021
 pleasure+0.020
Neg logit contributions
 other-0.005
 things-0.003
L-0.003
 people-0.003
 cars-0.003
Token looked
Token ID3114
Feature activation-3.42e-03

Pos logit contributions
 extingu+0.012
ccoli+0.012
lette+0.012
 pleasure+0.011
work+0.011
Neg logit contributions
 other-0.004
 things-0.003
L-0.003
 people-0.003
 cars-0.003
Token clean
Token ID3424
Feature activation+1.56e-03

Pos logit contributions
 extingu+0.009
ccoli+0.009
lette+0.008
work+0.008
 crop+0.008
Neg logit contributions
 other-0.005
 things-0.004
L-0.004
 cars-0.004
 people-0.004
Token.
Token ID13
Feature activation-1.84e-03

Pos logit contributions
 other+0.001
 things+0.001
L+0.001
 people+0.001
 cars+0.001
Neg logit contributions
 extingu-0.005
ccoli-0.005
lette-0.004
work-0.004
 pleasure-0.004
Token Little
Token ID7703
Feature activation-2.35e-02

Pos logit contributions
 extingu+0.004
ccoli+0.004
lette+0.004
work+0.004
 pleasure+0.003
Neg logit contributions
 other-0.003
L-0.003
 things-0.003
Tom-0.003
 cars-0.003
Token Lucy
Token ID22162
Feature activation-9.83e-03

Pos logit contributions
 extingu+0.054
ccoli+0.051
lette+0.050
work+0.049
ossom+0.048
Neg logit contributions
 other-0.038
 things-0.035
L-0.034
Tom-0.033
 Lily-0.033
Token walked
Token ID6807
Feature activation+4.40e-03

Pos logit contributions
 extingu+0.030
work+0.028
ccoli+0.028
 pleasure+0.028
lette+0.028
Neg logit contributions
 other-0.010
L-0.008
 me-0.008
 she-0.007
Tom-0.007
Token to
Token ID284
Feature activation-3.43e-03

Pos logit contributions
 other+0.008
L+0.007
 things+0.007
Tom+0.007
Stop+0.007
Neg logit contributions
 extingu-0.009
ccoli-0.008
work-0.008
lette-0.008
 pleasure-0.008
Token the
Token ID262
Feature activation+6.24e-04

Pos logit contributions
ccoli+0.011
 extingu+0.011
lette+0.010
work+0.010
ossom+0.010
Neg logit contributions
 other-0.003
 things-0.002
L-0.002
 people-0.002
 cars-0.002
Token park
Token ID3952
Feature activation-3.89e-03

Pos logit contributions
 other+0.001
 things+0.001
 cars+0.001
L+0.001
 people+0.001
Neg logit contributions
ccoli-0.002
 extingu-0.002
lette-0.002
work-0.001
 pleasure-0.001
Token stumbled
Token ID24241
Feature activation+5.01e-03

Pos logit contributions
 other+0.006
L+0.005
Tom+0.005
 things+0.005
Tim+0.004
Neg logit contributions
 extingu-0.014
ccoli-0.014
work-0.014
lette-0.014
 pleasure-0.013
Token across
Token ID1973
Feature activation-2.48e-03

Pos logit contributions
 other+0.008
 things+0.008
 cars+0.007
L+0.007
Stop+0.007
Neg logit contributions
 extingu-0.010
ccoli-0.010
lette-0.010
work-0.009
flake-0.009
Token a
Token ID257
Feature activation-1.58e-03

Pos logit contributions
 extingu+0.006
ccoli+0.006
work+0.006
lette+0.006
 pleasure+0.005
Neg logit contributions
 other-0.003
L-0.003
 things-0.003
Tom-0.003
 cars-0.003
Token big
Token ID1263
Feature activation-3.03e-03

Pos logit contributions
 extingu+0.004
ccoli+0.004
lette+0.004
work+0.003
 pleasure+0.003
Neg logit contributions
 other-0.002
L-0.002
 things-0.002
Tom-0.002
 cars-0.002
Token dish
Token ID9433
Feature activation-1.29e-02

Pos logit contributions
 extingu+0.008
ccoli+0.008
lette+0.007
work+0.007
 pleasure+0.007
Neg logit contributions
 other-0.004
L-0.003
 things-0.003
 cars-0.003
Tom-0.003
Tokenwasher
Token ID45146
Feature activation-2.20e-02

Pos logit contributions
ccoli+0.036
 extingu+0.036
 pleasure+0.035
lette+0.035
work+0.034
Neg logit contributions
 other-0.014
 things-0.011
L-0.011
 cars-0.011
 people-0.010
Token.
Token ID13
Feature activation-5.10e-04

Pos logit contributions
 extingu+0.063
ccoli+0.063
lette+0.061
 pleasure+0.060
work+0.060
Neg logit contributions
 other-0.022
 things-0.017
L-0.017
 cars-0.016
Tom-0.016
Token It
Token ID632
Feature activation-2.55e-03

Pos logit contributions
 extingu+0.001
ccoli+0.001
lette+0.001
work+0.001
 pleasure+0.001
Neg logit contributions
 other-0.001
 things-0.001
L-0.001
 cars-0.001
Tom-0.001
Token was
Token ID373
Feature activation+7.10e-03

Pos logit contributions
 extingu+0.007
ccoli+0.007
lette+0.007
work+0.007
 pleasure+0.007
Neg logit contributions
 other-0.002
L-0.002
 things-0.002
 cars-0.002
Tom-0.002
Token so
Token ID523
Feature activation+5.52e-03

Pos logit contributions
 other+0.010
 things+0.008
L+0.008
 cars+0.008
 people+0.007
Neg logit contributions
 extingu-0.018
ccoli-0.018
lette-0.018
work-0.017
 pleasure-0.017
Token excited
Token ID6568
Feature activation+1.76e-03

Pos logit contributions
 other+0.007
L+0.006
 things+0.006
 cars+0.006
Tom+0.006
Neg logit contributions
 extingu-0.014
ccoli-0.014
lette-0.014
work-0.014
 pleasure-0.013
Token was
Token ID373
Feature activation+1.15e-03

Pos logit contributions
 other+0.002
 people+0.001
L+0.001
 cars+0.001
 she+0.001
Neg logit contributions
 extingu-0.004
ccoli-0.004
lette-0.004
 pleasure-0.004
work-0.004
Token hungry
Token ID14720
Feature activation+5.09e-04

Pos logit contributions
 other+0.001
 people+0.001
L+0.001
 things+0.001
 cars+0.001
Neg logit contributions
 extingu-0.003
ccoli-0.003
work-0.003
lette-0.003
icol-0.003
Token.
Token ID13
Feature activation-3.73e-03

Pos logit contributions
 other+0.001
L+0.000
 things+0.000
Tom+0.000
 cars+0.000
Neg logit contributions
 extingu-0.001
ccoli-0.001
lette-0.001
work-0.001
 pleasure-0.001
Token The
Token ID383
Feature activation-1.99e-03

Pos logit contributions
 extingu+0.010
ccoli+0.010
lette+0.010
work+0.010
 pleasure+0.010
Neg logit contributions
 other-0.004
 things-0.003
L-0.003
 cars-0.003
Tom-0.003
Token re
Token ID302
Feature activation-1.31e-02

Pos logit contributions
 extingu+0.004
ccoli+0.004
lette+0.003
work+0.003
 pleasure+0.003
Neg logit contributions
 other-0.004
L-0.004
 things-0.004
 cars-0.003
Tom-0.003
Tokenind
Token ID521
Feature activation-2.77e-02

Pos logit contributions
 extingu+0.025
ccoli+0.025
lette+0.024
work+0.024
 pleasure+0.023
Neg logit contributions
 other-0.025
L-0.023
 things-0.022
 cars-0.022
Tom-0.022
Tokeneer
Token ID28153
Feature activation-2.00e-04

Pos logit contributions
 extingu+0.036
ccoli+0.036
 pleasure+0.035
lette+0.033
icol+0.033
Neg logit contributions
 other-0.072
 things-0.065
 people-0.064
 cars-0.063
L-0.062
Token knew
Token ID2993
Feature activation+7.82e-03

Pos logit contributions
 extingu+0.001
ccoli+0.001
icol+0.001
 pleasure+0.001
lette+0.001
Neg logit contributions
 other-0.000
 things-0.000
 people-0.000
 she-0.000
 cars-0.000
Token where
Token ID810
Feature activation+1.09e-04

Pos logit contributions
 other+0.007
L+0.006
 things+0.005
Tom+0.005
 cars+0.005
Neg logit contributions
 extingu-0.023
ccoli-0.022
lette-0.022
work-0.022
 pleasure-0.021
Token to
Token ID284
Feature activation-3.83e-03

Pos logit contributions
 other+0.000
 things+0.000
L+0.000
 cars+0.000
 people+0.000
Neg logit contributions
 extingu-0.000
ccoli-0.000
lette-0.000
work-0.000
 pleasure-0.000
Token find
Token ID1064
Feature activation-1.62e-02

Pos logit contributions
ccoli+0.009
 extingu+0.009
work+0.008
lette+0.008
icol+0.008
Neg logit contributions
 other-0.007
 people-0.006
 things-0.006
L-0.005
Tim-0.005
Token and
Token ID290
Feature activation+1.28e-02

Pos logit contributions
 other+0.001
L+0.001
 things+0.001
Tom+0.001
 cars+0.001
Neg logit contributions
 extingu-0.002
ccoli-0.002
lette-0.002
work-0.002
 pleasure-0.002
Token looked
Token ID3114
Feature activation-5.30e-03

Pos logit contributions
 other+0.017
 things+0.014
 she+0.013
L+0.013
 people+0.013
Neg logit contributions
 extingu-0.035
work-0.033
lette-0.033
ccoli-0.033
 pleasure-0.031
Token out
Token ID503
Feature activation-8.03e-03

Pos logit contributions
 extingu+0.015
lette+0.015
ccoli+0.014
work+0.014
flake+0.014
Neg logit contributions
 other-0.006
 things-0.006
 me-0.005
 she-0.005
 people-0.005
Token of
Token ID286
Feature activation-1.34e-02

Pos logit contributions
 extingu+0.021
ccoli+0.020
lette+0.020
work+0.020
 pleasure+0.020
Neg logit contributions
 other-0.009
L-0.008
Tom-0.008
Tim-0.008
 things-0.007
Token the
Token ID262
Feature activation-5.28e-03

Pos logit contributions
ccoli+0.042
 extingu+0.041
lette+0.041
work+0.041
ossom+0.039
Neg logit contributions
 other-0.011
 things-0.008
L-0.007
 cars-0.007
 people-0.007
Token window
Token ID4324
Feature activation-2.21e-02

Pos logit contributions
 extingu+0.015
ccoli+0.014
lette+0.014
work+0.014
 pleasure+0.013
Neg logit contributions
 other-0.007
 things-0.006
 people-0.005
 cars-0.005
 me-0.005
Token.
Token ID13
Feature activation-6.54e-05

Pos logit contributions
ccoli+0.067
 extingu+0.066
lette+0.064
 pleasure+0.063
work+0.062
Neg logit contributions
 other-0.019
 things-0.014
 cars-0.013
 people-0.013
L-0.013
Token He
Token ID679
Feature activation-2.54e-04

Pos logit contributions
 extingu+0.000
ccoli+0.000
lette+0.000
work+0.000
 pleasure+0.000
Neg logit contributions
 other-0.000
L-0.000
 things-0.000
 cars-0.000
Tom-0.000
Token saw
Token ID2497
Feature activation-5.39e-03

Pos logit contributions
 extingu+0.001
ccoli+0.001
lette+0.001
work+0.001
 pleasure+0.001
Neg logit contributions
 other-0.000
L-0.000
 things-0.000
 cars-0.000
Tom-0.000
Token some
Token ID617
Feature activation-2.90e-03

Pos logit contributions
 extingu+0.014
ccoli+0.014
lette+0.014
work+0.013
 pleasure+0.013
Neg logit contributions
 other-0.007
 things-0.006
L-0.006
 cars-0.005
Tom-0.005
Token friendly
Token ID8030
Feature activation-1.89e-03

Pos logit contributions
 extingu+0.008
ccoli+0.008
lette+0.007
work+0.007
 pleasure+0.007
Neg logit contributions
 other-0.004
 things-0.003
L-0.003
 cars-0.003
Tom-0.003
Token She
Token ID1375
Feature activation+3.73e-03

Pos logit contributions
 extingu+0.003
ccoli+0.002
lette+0.002
work+0.002
 pleasure+0.002
Neg logit contributions
 other-0.001
L-0.001
 things-0.001
 cars-0.000
 people-0.000
Token took
Token ID1718
Feature activation-4.55e-03

Pos logit contributions
 other+0.006
L+0.005
 cars+0.005
 me+0.004
 things+0.004
Neg logit contributions
 extingu-0.009
lette-0.009
ccoli-0.009
 pleasure-0.009
work-0.009
Token out
Token ID503
Feature activation-1.04e-02

Pos logit contributions
 extingu+0.012
ccoli+0.011
lette+0.011
work+0.011
 pleasure+0.011
Neg logit contributions
 other-0.006
L-0.005
 things-0.005
 cars-0.005
Tom-0.005
Token a
Token ID257
Feature activation+5.14e-03

Pos logit contributions
 extingu+0.029
ccoli+0.029
lette+0.029
work+0.028
 pleasure+0.028
Neg logit contributions
 other-0.011
 things-0.009
L-0.009
 cars-0.008
 people-0.008
Token magn
Token ID7842
Feature activation-4.12e-03

Pos logit contributions
 other+0.007
 things+0.006
L+0.006
 cars+0.005
 people+0.005
Neg logit contributions
lette-0.013
 extingu-0.013
ccoli-0.013
work-0.013
ugg-0.012
Tokenifying
Token ID4035
Feature activation-5.09e-02

Pos logit contributions
ccoli+0.006
 extingu+0.006
 pleasure+0.006
lette+0.006
 crop+0.006
Neg logit contributions
 other-0.009
L-0.008
 cars-0.008
 things-0.008
 me-0.008
Token glass
Token ID5405
Feature activation-1.07e-03

Pos logit contributions
ccoli+0.119
lette+0.118
urger+0.112
 extingu+0.110
 pleasure+0.104
Neg logit contributions
 other-0.086
L-0.075
 Lily-0.072
 me-0.072
 things-0.071
Token and
Token ID290
Feature activation+5.05e-03

Pos logit contributions
ccoli+0.003
lette+0.003
 extingu+0.003
 pleasure+0.003
urger+0.003
Neg logit contributions
 other-0.001
 things-0.001
L-0.001
 she-0.001
 cars-0.001
Token looked
Token ID3114
Feature activation-5.02e-03

Pos logit contributions
 other+0.008
 things+0.007
 she+0.007
 people+0.006
L+0.006
Neg logit contributions
 extingu-0.012
ccoli-0.012
lette-0.012
work-0.012
urger-0.011
Token very
Token ID845
Feature activation+4.13e-03

Pos logit contributions
 extingu+0.013
ccoli+0.013
lette+0.013
work+0.013
acket+0.012
Neg logit contributions
 other-0.007
 things-0.006
 people-0.005
 she-0.005
 it-0.005
Token carefully
Token ID7773
Feature activation+2.84e-03

Pos logit contributions
 other+0.006
 things+0.005
 people+0.005
L+0.005
 cars+0.005
Neg logit contributions
 extingu-0.010
ccoli-0.010
lette-0.010
work-0.010
urger-0.009
Token it
Token ID340
Feature activation+3.23e-06

Pos logit contributions
 extingu+0.014
ccoli+0.014
work+0.014
 pleasure+0.014
lette+0.014
Neg logit contributions
 other-0.004
L-0.003
Stop-0.003
Tom-0.003
 things-0.003
Token she
Token ID673
Feature activation+1.58e-02

Feature not active on previous token; no predictions were affected.
Token had
Token ID550
Feature activation-5.20e-03

Pos logit contributions
 other+0.020
L+0.016
 things+0.016
 cars+0.015
 people+0.015
Neg logit contributions
 extingu-0.044
ccoli-0.043
lette-0.042
work-0.041
 pleasure-0.040
Token a
Token ID257
Feature activation+7.82e-04

Pos logit contributions
 extingu+0.012
ccoli+0.012
lette+0.011
work+0.011
 pleasure+0.011
Neg logit contributions
 other-0.008
L-0.007
 things-0.007
 cars-0.006
Tom-0.006
Token magn
Token ID7842
Feature activation-5.39e-03

Pos logit contributions
 other+0.001
 things+0.001
L+0.001
 cars+0.001
 people+0.001
Neg logit contributions
 extingu-0.002
ccoli-0.002
lette-0.002
work-0.002
 pleasure-0.002
Tokenifying
Token ID4035
Feature activation-5.18e-02

Pos logit contributions
ccoli+0.008
 extingu+0.008
lette+0.007
 hyd+0.007
 pleasure+0.007
Neg logit contributions
 other-0.012
L-0.011
 cars-0.011
 she-0.011
 things-0.011
Token glass
Token ID5405
Feature activation+5.92e-04

Pos logit contributions
ccoli+0.122
lette+0.118
 extingu+0.115
urger+0.113
work+0.107
Neg logit contributions
 other-0.087
L-0.075
 me-0.074
 Lily-0.073
 things-0.069
Token,
Token ID11
Feature activation+1.58e-03

Pos logit contributions
 other+0.001
 things+0.001
L+0.000
 cars+0.000
 she+0.000
Neg logit contributions
ccoli-0.002
 extingu-0.002
lette-0.002
 pleasure-0.002
work-0.002
Token a
Token ID257
Feature activation+8.33e-04

Pos logit contributions
 other+0.002
 things+0.002
L+0.002
 cars+0.001
 people+0.001
Neg logit contributions
 extingu-0.004
ccoli-0.004
lette-0.004
work-0.004
 pleasure-0.004
Token tiny
Token ID7009
Feature activation+3.86e-03

Pos logit contributions
 other+0.001
 things+0.001
L+0.001
 cars+0.001
 people+0.001
Neg logit contributions
 extingu-0.002
ccoli-0.002
lette-0.002
work-0.002
 pleasure-0.002
Token torch
Token ID28034
Feature activation-4.76e-03

Pos logit contributions
 other+0.006
 things+0.005
L+0.005
 cars+0.005
 people+0.005
Neg logit contributions
ccoli-0.009
 extingu-0.009
lette-0.009
work-0.008
 pleasure-0.008
Token hat
Token ID6877
Feature activation+6.15e-03

Pos logit contributions
 extingu+0.002
lette+0.002
ccoli+0.002
work+0.002
 pleasure+0.001
Neg logit contributions
 other-0.001
 things-0.001
 cars-0.001
L-0.001
 people-0.001
Token and
Token ID290
Feature activation-7.97e-04

Pos logit contributions
 other+0.006
L+0.005
 things+0.005
 cars+0.004
Tom+0.004
Neg logit contributions
 extingu-0.018
ccoli-0.018
lette-0.017
 pleasure-0.017
work-0.017
Token grabbed
Token ID13176
Feature activation-6.61e-03

Pos logit contributions
 extingu+0.002
lette+0.002
ccoli+0.002
work+0.002
 pleasure+0.002
Neg logit contributions
 other-0.001
 things-0.001
L-0.001
 cars-0.001
 people-0.001
Token her
Token ID607
Feature activation-5.62e-03

Pos logit contributions
 extingu+0.020
ccoli+0.020
lette+0.019
work+0.019
 pleasure+0.019
Neg logit contributions
 other-0.006
 things-0.004
L-0.004
 cars-0.004
Tom-0.004
Token magn
Token ID7842
Feature activation-3.69e-03

Pos logit contributions
 extingu+0.013
lette+0.013
ccoli+0.012
work+0.012
 pleasure+0.012
Neg logit contributions
 other-0.009
 things-0.008
 cars-0.008
L-0.008
 people-0.007
Tokenifying
Token ID4035
Feature activation-4.73e-02

Pos logit contributions
ccoli+0.005
 extingu+0.005
 pleasure+0.005
 crop+0.005
lette+0.005
Neg logit contributions
 other-0.008
L-0.008
 cars-0.007
 me-0.007
Tom-0.007
Token glass
Token ID5405
Feature activation+9.36e-04

Pos logit contributions
ccoli+0.104
lette+0.104
 extingu+0.098
work+0.097
urger+0.096
Neg logit contributions
 other-0.084
 Lily-0.075
L-0.074
 me-0.072
 things-0.071
Token.
Token ID13
Feature activation+3.30e-03

Pos logit contributions
 other+0.001
 things+0.001
L+0.001
 cars+0.001
 people+0.001
Neg logit contributions
ccoli-0.003
 extingu-0.003
lette-0.003
 pleasure-0.003
work-0.003
Token She
Token ID1375
Feature activation+6.03e-03

Pos logit contributions
 other+0.004
 things+0.003
 cars+0.003
L+0.003
 people+0.003
Neg logit contributions
ccoli-0.009
 extingu-0.009
 pleasure-0.008
lette-0.008
work-0.008
Token was
Token ID373
Feature activation-6.90e-04

Pos logit contributions
 other+0.009
L+0.008
Tom+0.007
 cars+0.007
 things+0.007
Neg logit contributions
 extingu-0.015
lette-0.014
 pleasure-0.014
work-0.014
ccoli-0.014
Token ready
Token ID3492
Feature activation-6.44e-04

Pos logit contributions
 extingu+0.002
ccoli+0.002
work+0.002
lette+0.002
acket+0.002
Neg logit contributions
 other-0.001
 things-0.001
 people-0.001
 Lily-0.001
 she-0.001
Token Everyday
Token ID48154
Feature activation-1.32e-02

Pos logit contributions
 extingu+0.016
ccoli+0.015
lette+0.015
urger+0.015
ossom+0.014
Neg logit contributions
 other-0.007
 she-0.006
L-0.006
 people-0.005
 me-0.005
Token he
Token ID339
Feature activation+9.94e-04

Pos logit contributions
 extingu+0.041
ccoli+0.040
lette+0.040
work+0.039
 pleasure+0.038
Neg logit contributions
 other-0.010
L-0.007
 things-0.007
 cars-0.007
Tom-0.006
Token would
Token ID561
Feature activation-3.99e-03

Pos logit contributions
 other+0.001
 things+0.001
 me+0.001
 people+0.001
 cars+0.001
Neg logit contributions
 extingu-0.003
ccoli-0.002
work-0.002
lette-0.002
 pleasure-0.002
Token take
Token ID1011
Feature activation-2.54e-03

Pos logit contributions
 extingu+0.010
ccoli+0.010
icol+0.009
ugg+0.009
work+0.009
Neg logit contributions
 other-0.006
 me-0.005
 people-0.005
 she-0.005
 things-0.005
Token the
Token ID262
Feature activation+3.35e-04

Pos logit contributions
 extingu+0.007
ccoli+0.007
lette+0.007
work+0.007
 pleasure+0.007
Neg logit contributions
 other-0.003
 things-0.002
L-0.002
 cars-0.002
 people-0.002
Token av
Token ID1196
Feature activation-4.74e-02

Pos logit contributions
 other+0.001
 things+0.001
 cars+0.000
 people+0.000
 kids+0.000
Neg logit contributions
ccoli-0.001
 extingu-0.001
lette-0.001
ossom-0.001
work-0.001
Tokenoc
Token ID420
Feature activation-1.87e-02

Pos logit contributions
 extingu+0.088
ccoli+0.085
work+0.081
lette+0.079
 pleasure+0.079
Neg logit contributions
 other-0.095
 things-0.086
L-0.086
 cars-0.082
Tom-0.081
Tokenados
Token ID22484
Feature activation-9.00e-03

Pos logit contributions
 extingu+0.040
ccoli+0.039
 pleasure+0.037
lette+0.036
ossom+0.036
Neg logit contributions
 other-0.031
 things-0.029
 cars-0.027
 people-0.027
 me-0.025
Token and
Token ID290
Feature activation+4.87e-03

Pos logit contributions
 extingu+0.024
ccoli+0.023
lette+0.023
icol+0.022
 pleasure+0.022
Neg logit contributions
 other-0.012
 things-0.010
 she-0.009
 cars-0.009
 people-0.009
Token scatter
Token ID41058
Feature activation-5.00e-03

Pos logit contributions
 other+0.008
 things+0.007
 people+0.007
 cars+0.007
L+0.007
Neg logit contributions
 extingu-0.011
ccoli-0.011
lette-0.010
work-0.010
ugg-0.010
Token them
Token ID606
Feature activation-9.30e-05

Pos logit contributions
 extingu+0.016
ccoli+0.015
lette+0.015
work+0.015
 pleasure+0.015
Neg logit contributions
 other-0.004
 things-0.003
L-0.003
 cars-0.002
Tom-0.002
Token\n
Token ID198
Feature activation+1.91e-03

Pos logit contributions
 other+0.001
 things+0.001
 cars+0.001
 balls+0.001
Stop+0.001
Neg logit contributions
 extingu-0.002
ccoli-0.002
lette-0.002
work-0.002
 pleasure-0.002
TokenHe
Token ID1544
Feature activation+2.68e-03

Pos logit contributions
 other+0.002
 things+0.002
 cars+0.002
 balls+0.002
 kids+0.002
Neg logit contributions
 extingu-0.005
lette-0.004
ccoli-0.004
work-0.004
ossom-0.004
Token pointed
Token ID6235
Feature activation+6.90e-04

Pos logit contributions
 other+0.004
L+0.004
Tom+0.003
 things+0.003
 cars+0.003
Neg logit contributions
 extingu-0.006
ccoli-0.006
lette-0.006
 pleasure-0.006
work-0.006
Token his
Token ID465
Feature activation-4.98e-03

Pos logit contributions
 other+0.001
 things+0.001
L+0.001
 cars+0.001
 people+0.001
Neg logit contributions
 extingu-0.002
ccoli-0.002
lette-0.002
work-0.002
 pleasure-0.002
Token magn
Token ID7842
Feature activation-6.38e-03

Pos logit contributions
 extingu+0.011
ccoli+0.010
lette+0.010
work+0.010
 pleasure+0.010
Neg logit contributions
 other-0.009
 things-0.008
L-0.008
 cars-0.007
 people-0.007
Tokenifying
Token ID4035
Feature activation-5.13e-02

Pos logit contributions
ccoli+0.010
 extingu+0.009
 pleasure+0.009
lette+0.009
 crop+0.009
Neg logit contributions
 other-0.014
 things-0.013
L-0.013
 cars-0.013
 me-0.012
Token glass
Token ID5405
Feature activation-6.04e-04

Pos logit contributions
ccoli+0.113
lette+0.112
 extingu+0.107
urger+0.104
work+0.103
Neg logit contributions
 other-0.091
 things-0.079
L-0.078
 Lily-0.076
 me-0.074
Token at
Token ID379
Feature activation+5.05e-03

Pos logit contributions
ccoli+0.002
 extingu+0.002
lette+0.002
 pleasure+0.002
work+0.002
Neg logit contributions
 other-0.001
 things-0.001
 cars-0.001
L-0.001
 people-0.001
Token different
Token ID1180
Feature activation+6.05e-03

Pos logit contributions
 other+0.005
L+0.005
Stop+0.004
Tom+0.004
Tim+0.004
Neg logit contributions
 extingu-0.014
work-0.014
ccoli-0.013
lette-0.013
 pleasure-0.013
Token classmates
Token ID28999
Feature activation+8.68e-03

Pos logit contributions
L+0.007
 other+0.007
Tom+0.007
Tim+0.007
Stop+0.007
Neg logit contributions
 extingu-0.015
lette-0.014
work-0.014
ccoli-0.014
 pleasure-0.013
Token and
Token ID290
Feature activation+3.22e-04

Pos logit contributions
 other+0.010
L+0.008
 things+0.008
Tom+0.007
 cars+0.007
Neg logit contributions
 extingu-0.024
ccoli-0.023
lette-0.023
work-0.022
 pleasure-0.022