TOP ACTIVATIONS
MAX = 6.743

and she was the happiest she 'd ever been . <|endoftext|>
was much greater than any she could ever imagine . She
it was the most fun she ever had . She can
the sky , the dragon ruled over the land with might
was proud of the clothes she had worn . <|endoftext|> Once
saw that the land was ruled by a mean monster .
be careful with the things she loved . <|endoftext|> Once upon
Fire is not a toy " on the bottom . They
par rot was the happiest he had ever been . <|endoftext|>
and mysterious land that was ruled by a powerful witch .
always proud of the things she could do , even when
rot was the happiest he had ever been . <|endoftext|> Once
was the most interesting thing she had ever seen . <|endoftext|>
was the most beautiful horn she 'd ever seen .
away from any dangerous things she might find . <|endoftext|> Once
a lot more magical than she first thought . <|endoftext|> The
is the most beautiful card I ever got . You are
daddy all about the fun she had . <|endoftext|> Once upon
than the other penn ies she had . She wondered what
is the most beautiful rose she has ever seen . She

MIN ACTIVATIONS
MIN = -6.844

number . " I weigh 40 pounds . That 's a
could only watch TV for 30 minutes . Lily asked Mitt
named Jack . He was 7 years old and he wanted
said 3 plus 4 makes 7 . The little girl was
" I have six blocks in my row ,"
. " All erg ies !" shouted Anne .
's set the timer for 30 minutes ," dad said .
Max and Cole loved to play outside .
clown s juggling and ac rob ats doing flips . But
Lily . " I guess 30 ." They asked
points . When he reached fifteen points , he shouted ,
I can ! There are 8 36 leaves on this tree
the oven . After about 15 minutes , she took the
. She tried counting to 10 but then she got stuck
the floor . She had 4 of them , and felt
" I got to number six !" Emma said . "
the ground . It had six legs and two feel ers
a mess . Lily put six jelly cups in a basket
, six . She has six past els . Four are
two brothers named Max and Cole . Max

INTERVAL 3.346 - 6.743
CONTAINS 0.631%

He was grateful for everything he had and everyone he knew
be good friends again ! <|endoftext|> Once there was a girl
and went home happy . <|endoftext|> Once upon a time ,
Lily 's tooth was too damaged and had to be removed
girl started to cry . <|endoftext|> Once there was a small

INTERVAL -0.050 - 3.346
CONTAINS 50.437%

the eye . He showed the monster that he was brave
. Her mom said no , it was too dangerous .
the block and break my house . It will be noisy
and then it flew away . Conf used
he faced it with courage . The moral of the story

INTERVAL -3.447 - -0.050
CONTAINS 48.613%

she found a new friend . It was a big ,
feeling proud of herself . H annah and
" Let 's make a cake !" Lily said
with her toys , but accidentally split her favorite doll in
. He wanted to make things better but didn 't know

INTERVAL -6.844 - -3.447
CONTAINS 0.319%

green . " Look at my pattern , Tom
Tim my 's team practiced hard and they were excited to
If you don ’ t do what I
to her mom . " Look , mom , I made
will be ready in about 30 minutes . You can help
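
NOTE ON INTERVALS
The four INTERVAL edges listed above (-6.844, -3.447, -0.050, 3.346, 6.743) are consistent with four equal-width bins spanning MIN to MAX. The sketch below reproduces such edges and computes the share of activations per bin. It is an illustration under that equal-width assumption, not the dashboard's own code; the function names interval_edges and interval_shares are hypothetical.

```python
def interval_edges(min_act, max_act, n_bins=4):
    """Return n_bins + 1 evenly spaced edges from min_act to max_act."""
    width = (max_act - min_act) / n_bins
    edges = [min_act + i * width for i in range(n_bins + 1)]
    edges[-1] = max_act  # pin the last edge so the maximum is never lost to rounding
    return edges


def interval_shares(activations, edges):
    """Return the percentage of activations that fall into each bin.

    Bins are half-open [lo, hi), except the last one, which also
    includes its upper edge so the maximum activation is counted.
    """
    n_bins = len(edges) - 1
    counts = [0] * n_bins
    for a in activations:
        for i in range(n_bins):
            is_last = i == n_bins - 1
            below_hi = a <= edges[i + 1] if is_last else a < edges[i + 1]
            if edges[i] <= a and below_hi:
                counts[i] += 1
                break
    total = sum(counts)
    if total == 0:
        return [0.0] * n_bins
    return [100.0 * c / total for c in counts]


# Edges implied by MIN = -6.844 and MAX = 6.743 with four bins.
edges = interval_edges(-6.844, 6.743)
print([round(e, 3) for e in edges])
```

The CONTAINS percentages listed above sum to 100% (0.631 + 50.437 + 48.613 + 0.319), which is what interval_shares would return when every activation lies between MIN and MAX.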
Token and
Token ID290
Feature activation-1.91e-01

Pos logit contributions
arl+0.015
rou+0.014
run+0.014
ounces+0.012
om+0.012
Neg logit contributions
 exercises-0.016
ured-0.016
 poses-0.016
 therapist-0.016
-0.015
Token she
Token ID673
Feature activation+7.01e-01

Pos logit contributions
ured+0.004
+0.004
 exercises+0.004
seek+0.004
 poses+0.004
Neg logit contributions
rou-0.004
run-0.004
arl-0.004
ounces-0.003
 Johnny-0.003
Token was
Token ID373
Feature activation-1.12e-01

Pos logit contributions
rou+0.017
run+0.016
arl+0.015
om+0.014
o+0.014
Neg logit contributions
ured-0.015
 exercises-0.014
nic-0.014
 poses-0.014
seek-0.013
Token the
Token ID262
Feature activation+2.32e+00

Pos logit contributions
 exercises+0.003
ured+0.003
 therapist+0.002
 poses+0.002
+0.002
Neg logit contributions
arl-0.003
rou-0.002
run-0.002
ounces-0.002
om-0.002
Token happiest
Token ID49414
Feature activation+2.10e+00

Pos logit contributions
arl+0.056
rou+0.056
run+0.050
joy+0.045
 upon+0.045
Neg logit contributions
ured-0.046
seek-0.044
zed-0.043
ves-0.043
-0.043
Token she
Token ID673
Feature activation+6.29e+00

Pos logit contributions
arl+0.045
rou+0.043
run+0.041
joy+0.035
om+0.034
Neg logit contributions
ured-0.049
 exercises-0.048
seek-0.047
 poses-0.046
-0.046
Token'd
Token ID1549
Feature activation+5.59e+00

Pos logit contributions
arl+0.134
rou+0.129
run+0.125
ounces+0.106
om+0.105
Neg logit contributions
 exercises-0.148
ured-0.144
 poses-0.143
 therapist-0.139
-0.136
Token ever
Token ID1683
Feature activation+2.89e+00

Pos logit contributions
arl+0.098
rou+0.093
run+0.092
om+0.074
ounces+0.073
Neg logit contributions
 exercises-0.150
ured-0.147
 poses-0.146
 therapist-0.142
-0.139
Token been
Token ID587
Feature activation+2.97e+00

Pos logit contributions
arl+0.064
rou+0.059
run+0.057
ounces+0.050
om+0.050
Neg logit contributions
 exercises-0.066
ured-0.066
 therapist-0.065
 poses-0.064
seek-0.064
Token.
Token ID13
Feature activation+5.38e-01

Pos logit contributions
arl+0.064
rou+0.056
run+0.053
ucker+0.048
ounces+0.048
Neg logit contributions
ured-0.071
 exercises-0.071
 therapist-0.070
 poses-0.069
seek-0.069
Token<|endoftext|>
Token ID50256
Feature activation+3.58e+00

Pos logit contributions
arl+0.011
rou+0.011
ounces+0.009
run+0.009
ucker+0.009
Neg logit contributions
 therapist-0.013
-0.012
-0.012
 exercises-0.012
zed-0.012
Token was
Token ID373
Feature activation+1.16e+00

Pos logit contributions
arl+0.036
rou+0.036
run+0.035
ounces+0.029
om+0.028
Neg logit contributions
 exercises-0.050
ured-0.049
 poses-0.049
 therapist-0.049
-0.048
Token much
Token ID881
Feature activation+2.75e+00

Pos logit contributions
rou+0.019
run+0.018
arl+0.017
 upon+0.014
ounces+0.014
Neg logit contributions
 poses-0.032
 exercises-0.031
 therapist-0.031
ured-0.031
seek-0.031
Token greater
Token ID3744
Feature activation+2.56e-01

Pos logit contributions
arl+0.061
rou+0.059
run+0.057
ounces+0.049
om+0.049
Neg logit contributions
 exercises-0.061
ured-0.061
 poses-0.060
 therapist-0.060
-0.058
Token than
Token ID621
Feature activation+2.03e+00

Pos logit contributions
rou+0.005
run+0.005
arl+0.005
joy+0.005
 upon+0.004
Neg logit contributions
ured-0.006
-0.006
 therapist-0.006
nic-0.006
 exercises-0.006
Token any
Token ID597
Feature activation-2.04e-01

Pos logit contributions
arl+0.054
rou+0.049
run+0.047
ucker+0.045
ounces+0.043
Neg logit contributions
 exercises-0.039
 poses-0.038
ured-0.037
 therapist-0.036
seek-0.036
Token she
Token ID673
Feature activation+6.06e+00

Pos logit contributions
 exercises+0.004
ured+0.004
 poses+0.004
 therapist+0.004
+0.004
Neg logit contributions
arl-0.005
rou-0.005
run-0.004
ounces-0.004
om-0.004
Token could
Token ID714
Feature activation+3.72e+00

Pos logit contributions
rou+0.126
run+0.124
arl+0.123
 upon+0.103
o+0.103
Neg logit contributions
 exercises-0.142
ured-0.137
 poses-0.136
 therapist-0.132
-0.127
Token ever
Token ID1683
Feature activation+1.30e+00

Pos logit contributions
run+0.060
arl+0.057
rou+0.056
 upon+0.050
o+0.044
Neg logit contributions
 therapist-0.099
 exercises-0.098
 poses-0.096
ured-0.094
nic-0.093
Token imagine
Token ID5967
Feature activation-4.08e-01

Pos logit contributions
arl+0.028
rou+0.026
run+0.026
ounces+0.022
om+0.022
Neg logit contributions
 exercises-0.030
ured-0.030
 poses-0.029
 therapist-0.029
seek-0.028
Token.
Token ID13
Feature activation+3.55e-01

Pos logit contributions
 therapist+0.012
ured+0.012
+0.012
 exercises+0.012
nic+0.012
Neg logit contributions
rou-0.005
run-0.005
arl-0.004
 upon-0.004
joy-0.003
Token She
Token ID1375
Feature activation+1.75e-01

Pos logit contributions
arl+0.009
rou+0.009
run+0.008
etchup+0.008
hin+0.008
Neg logit contributions
 therapist-0.007
 poses-0.007
seek-0.006
 exercises-0.006
ured-0.006
Token it
Token ID340
Feature activation-5.98e-01

Pos logit contributions
run+0.006
rou+0.006
arl+0.005
 Tim+0.005
 Johnny+0.005
Neg logit contributions
-0.008
 exercises-0.008
ured-0.008
 poses-0.008
 therapist-0.008
Token was
Token ID373
Feature activation+2.75e-01

Pos logit contributions
ured+0.016
+0.015
 exercises+0.015
seek+0.015
 poses+0.015
Neg logit contributions
rou-0.010
run-0.009
o-0.009
 Johnny-0.008
 Tim-0.008
Token the
Token ID262
Feature activation+2.07e+00

Pos logit contributions
arl+0.006
rou+0.005
run+0.005
ounces+0.004
om+0.004
Neg logit contributions
 exercises-0.007
ured-0.007
 poses-0.007
 therapist-0.007
-0.006
Token most
Token ID749
Feature activation+5.95e-01

Pos logit contributions
arl+0.057
rou+0.055
run+0.053
ounces+0.048
om+0.047
Neg logit contributions
 exercises-0.036
ured-0.035
 poses-0.034
 therapist-0.034
-0.033
Token fun
Token ID1257
Feature activation+1.25e+00

Pos logit contributions
rou+0.014
arl+0.014
run+0.013
om+0.012
ucker+0.011
Neg logit contributions
ured-0.012
 exercises-0.012
 therapist-0.012
 poses-0.012
-0.012
Token she
Token ID673
Feature activation+6.02e+00

Pos logit contributions
arl+0.026
rou+0.025
run+0.025
om+0.021
ounces+0.021
Neg logit contributions
ured-0.029
 exercises-0.028
 therapist-0.027
-0.027
 poses-0.027
Token ever
Token ID1683
Feature activation+2.89e+00

Pos logit contributions
rou+0.114
arl+0.114
run+0.111
o+0.092
 upon+0.091
Neg logit contributions
 exercises-0.152
ured-0.146
 poses-0.145
 therapist-0.139
seek-0.136
Token had
Token ID550
Feature activation+5.05e+00

Pos logit contributions
arl+0.068
rou+0.062
run+0.061
ounces+0.053
ucker+0.053
Neg logit contributions
ured-0.063
 exercises-0.063
 therapist-0.062
 poses-0.062
seek-0.061
Token.
Token ID13
Feature activation+2.20e-01

Pos logit contributions
rou+0.077
arl+0.077
run+0.074
ounces+0.062
om+0.061
Neg logit contributions
 exercises-0.144
ured-0.140
 therapist-0.139
 poses-0.137
-0.134
Token She
Token ID1375
Feature activation+1.34e-01

Pos logit contributions
arl+0.004
rou+0.004
run+0.004
ounces+0.003
om+0.003
Neg logit contributions
 exercises-0.006
ured-0.006
 poses-0.006
 therapist-0.005
-0.005
Token can
Token ID460
Feature activation-1.45e+00

Pos logit contributions
rou+0.003
run+0.003
arl+0.002
om+0.002
o+0.002
Neg logit contributions
ured-0.003
 exercises-0.003
-0.003
 poses-0.003
nic-0.003
Token the
Token ID262
Feature activation-2.21e-01

Pos logit contributions
 exercises+0.025
 poses+0.024
ured+0.023
ger+0.023
 therapist+0.022
Neg logit contributions
arl-0.021
rou-0.019
run-0.018
ucker-0.017
ounces-0.016
Token sky
Token ID6766
Feature activation+1.79e+00

Pos logit contributions
 exercises+0.004
ured+0.004
 poses+0.004
 therapist+0.004
+0.004
Neg logit contributions
arl-0.006
rou-0.005
run-0.005
ounces-0.005
om-0.005
Token,
Token ID11
Feature activation-3.97e-01

Pos logit contributions
rou+0.031
run+0.029
arl+0.029
om+0.026
 upon+0.026
Neg logit contributions
ured-0.048
 therapist-0.047
 exercises-0.047
 poses-0.046
nic-0.046
Token the
Token ID262
Feature activation+1.04e+00

Pos logit contributions
 exercises+0.010
ured+0.010
 poses+0.010
 therapist+0.010
+0.009
Neg logit contributions
arl-0.008
rou-0.007
run-0.007
ounces-0.006
om-0.006
Token dragon
Token ID10441
Feature activation+1.59e+00

Pos logit contributions
arl+0.027
run+0.026
rou+0.026
ounces+0.023
ucker+0.023
Neg logit contributions
 exercises-0.020
 therapist-0.019
 poses-0.019
ured-0.019
ger-0.018
Token ruled
Token ID8879
Feature activation+6.00e+00

Pos logit contributions
rou+0.034
run+0.033
arl+0.032
om+0.030
o+0.029
Neg logit contributions
 exercises-0.035
ured-0.035
 poses-0.034
 therapist-0.034
-0.033
Token over
Token ID625
Feature activation+2.55e-02

Pos logit contributions
arl+0.141
rou+0.132
run+0.132
ucker+0.119
ounces+0.116
Neg logit contributions
ured-0.126
ger-0.123
seek-0.122
 exercises-0.122
 therapist-0.121
Token the
Token ID262
Feature activation+1.31e+00

Pos logit contributions
arl+0.001
rou+0.001
run+0.001
ucker+0.000
ounces+0.000
Neg logit contributions
 exercises-0.001
 poses-0.001
ured-0.001
 therapist-0.001
ger-0.001
Token land
Token ID1956
Feature activation+2.39e+00

Pos logit contributions
arl+0.034
run+0.032
rou+0.031
ucker+0.028
ounces+0.028
Neg logit contributions
 exercises-0.026
 poses-0.025
ured-0.025
 therapist-0.025
ger-0.025
Token with
Token ID351
Feature activation+5.78e-02

Pos logit contributions
arl+0.045
rou+0.043
run+0.041
om+0.035
ounces+0.035
Neg logit contributions
 exercises-0.061
ured-0.061
 poses-0.060
 therapist-0.060
-0.058
Token might
Token ID1244
Feature activation+1.66e+00

Pos logit contributions
arl+0.001
rou+0.001
run+0.001
ucker+0.001
ounces+0.001
Neg logit contributions
 exercises-0.001
ured-0.001
 poses-0.001
 therapist-0.001
ger-0.001
Token was
Token ID373
Feature activation-1.30e-01

Pos logit contributions
rou+0.003
arl+0.003
run+0.003
ounces+0.002
om+0.002
Neg logit contributions
ured-0.003
 exercises-0.003
-0.003
 poses-0.003
ger-0.003
Token proud
Token ID6613
Feature activation+1.35e+00

Pos logit contributions
 exercises+0.003
ured+0.003
 therapist+0.003
 poses+0.003
+0.003
Neg logit contributions
arl-0.003
rou-0.002
run-0.002
ounces-0.002
om-0.002
Token of
Token ID286
Feature activation+2.55e+00

Pos logit contributions
arl+0.015
rou+0.014
run+0.013
ounces+0.009
om+0.009
Neg logit contributions
ured-0.046
 exercises-0.045
 poses-0.044
 therapist-0.044
-0.044
Token the
Token ID262
Feature activation+1.45e+00

Pos logit contributions
arl+0.062
rou+0.060
run+0.058
ounces+0.050
om+0.049
Neg logit contributions
ured-0.050
 exercises-0.050
 therapist-0.049
seek-0.049
 poses-0.049
Token clothes
Token ID8242
Feature activation+2.13e-01

Pos logit contributions
rou+0.026
arl+0.025
joy+0.023
 upon+0.021
om+0.021
Neg logit contributions
ured-0.037
seek-0.036
edly-0.033
 exercises-0.033
-0.033
Token she
Token ID673
Feature activation+5.94e+00

Pos logit contributions
arl+0.004
rou+0.004
run+0.003
om+0.003
ounces+0.003
Neg logit contributions
 exercises-0.006
ured-0.006
 poses-0.006
-0.006
seek-0.006
Token had
Token ID550
Feature activation+4.07e+00

Pos logit contributions
rou+0.113
run+0.110
arl+0.109
o+0.094
 upon+0.092
Neg logit contributions
 exercises-0.150
 poses-0.143
ured-0.143
-0.132
seek-0.131
Token worn
Token ID12666
Feature activation+3.54e-01

Pos logit contributions
arl+0.077
rou+0.076
run+0.074
om+0.063
 upon+0.062
Neg logit contributions
 exercises-0.102
 poses-0.098
ured-0.096
 therapist-0.095
-0.093
Token.
Token ID13
Feature activation+5.30e-01

Pos logit contributions
arl+0.006
rou+0.005
run+0.005
om+0.004
 upon+0.004
Neg logit contributions
 exercises-0.010
 poses-0.010
-0.010
seek-0.010
ured-0.010
Token<|endoftext|>
Token ID50256
Feature activation+3.57e+00

Pos logit contributions
arl+0.009
rou+0.009
run+0.008
ounces+0.007
ucker+0.006
Neg logit contributions
 therapist-0.015
 exercises-0.015
-0.014
 poses-0.014
ured-0.014
TokenOnce
Token ID7454
Feature activation+6.81e-01

Pos logit contributions
rou+0.090
arl+0.087
run+0.079
ucker+0.075
ounces+0.071
Neg logit contributions
ger-0.078
 therapist-0.077
gers-0.071
seek-0.070
 exercises-0.070
Token saw
Token ID2497
Feature activation+1.39e+00

Pos logit contributions
 exercises+0.013
 poses+0.013
ured+0.013
 therapist+0.013
seek+0.013
Neg logit contributions
rou-0.006
 upon-0.006
run-0.005
o-0.005
ounces-0.005
Token that
Token ID326
Feature activation+1.79e+00

Pos logit contributions
arl+0.035
run+0.031
rou+0.031
ucker+0.028
ounces+0.027
Neg logit contributions
 exercises-0.029
ured-0.029
 poses-0.028
ger-0.027
-0.027
Token the
Token ID262
Feature activation+1.03e+00

Pos logit contributions
arl+0.028
rou+0.027
run+0.025
ounces+0.021
om+0.021
Neg logit contributions
ured-0.052
 exercises-0.051
 poses-0.051
 therapist-0.050
-0.049
Token land
Token ID1956
Feature activation+1.51e+00

Pos logit contributions
arl+0.028
run+0.027
rou+0.027
ucker+0.024
ounces+0.023
Neg logit contributions
 exercises-0.018
 poses-0.018
ured-0.018
 therapist-0.017
ger-0.017
Token was
Token ID373
Feature activation+6.63e-02

Pos logit contributions
arl+0.036
rou+0.036
run+0.035
om+0.031
ounces+0.031
Neg logit contributions
 exercises-0.031
ured-0.030
 poses-0.030
 therapist-0.029
seek-0.029
Token ruled
Token ID8879
Feature activation+5.88e+00

Pos logit contributions
arl+0.001
rou+0.001
run+0.001
ounces+0.001
om+0.001
Neg logit contributions
 exercises-0.002
ured-0.002
 poses-0.002
 therapist-0.002
seek-0.002
Token by
Token ID416
Feature activation-1.27e+00

Pos logit contributions
arl+0.154
run+0.149
rou+0.141
etchup+0.138
ucker+0.136
Neg logit contributions
ured-0.105
ger-0.104
seek-0.103
tered-0.097
 exercises-0.097
Token a
Token ID257
Feature activation+1.97e+00

Pos logit contributions
 exercises+0.030
ured+0.029
 poses+0.029
ger+0.028
seek+0.028
Neg logit contributions
arl-0.028
rou-0.025
run-0.025
ucker-0.023
ounces-0.022
Token mean
Token ID1612
Feature activation+4.23e-01

Pos logit contributions
arl+0.053
run+0.053
rou+0.048
ucker+0.045
etchup+0.042
Neg logit contributions
ger-0.036
 exercises-0.035
 therapist-0.035
nic-0.034
 poses-0.034
Token monster
Token ID9234
Feature activation+1.19e+00

Pos logit contributions
arl+0.011
run+0.011
rou+0.011
etchup+0.009
ucker+0.009
Neg logit contributions
ger-0.008
 exercises-0.007
 poses-0.007
ured-0.007
 therapist-0.007
Token.
Token ID13
Feature activation+7.11e-01

Pos logit contributions
arl+0.026
rou+0.023
run+0.022
ucker+0.020
ounces+0.019
Neg logit contributions
ured-0.028
 exercises-0.028
 poses-0.028
 therapist-0.028
ger-0.027
Token be
Token ID307
Feature activation+6.57e-01

Pos logit contributions
arl+0.031
rou+0.029
run+0.029
ounces+0.025
om+0.025
Neg logit contributions
 exercises-0.024
ured-0.024
 poses-0.024
 therapist-0.024
-0.023
Token careful
Token ID8161
Feature activation-1.89e+00

Pos logit contributions
rou+0.014
arl+0.014
run+0.013
ounces+0.011
om+0.011
Neg logit contributions
 exercises-0.015
ured-0.015
 poses-0.015
 therapist-0.015
ger-0.014
Token with
Token ID351
Feature activation-6.68e-02

Pos logit contributions
 therapist+0.045
ured+0.045
 exercises+0.044
 poses+0.044
nic+0.044
Neg logit contributions
arl-0.037
rou-0.037
run-0.036
om-0.030
o-0.028
Token the
Token ID262
Feature activation+1.32e+00

Pos logit contributions
 exercises+0.001
ured+0.001
 poses+0.001
 therapist+0.001
+0.001
Neg logit contributions
arl-0.002
rou-0.002
run-0.002
ounces-0.001
om-0.001
Token things
Token ID1243
Feature activation-3.09e-01

Pos logit contributions
arl+0.031
rou+0.030
run+0.029
om+0.026
ounces+0.025
Neg logit contributions
ured-0.028
 exercises-0.028
 therapist-0.027
 poses-0.027
seek-0.026
Token she
Token ID673
Feature activation+5.88e+00

Pos logit contributions
ured+0.006
 exercises+0.006
 poses+0.006
seek+0.006
+0.006
Neg logit contributions
arl-0.007
rou-0.007
run-0.007
om-0.006
ounces-0.006
Token loved
Token ID6151
Feature activation+3.87e+00

Pos logit contributions
arl+0.135
rou+0.133
run+0.130
ounces+0.112
o+0.111
Neg logit contributions
ured-0.127
 exercises-0.125
 poses-0.121
 therapist-0.120
ger-0.114
Token.
Token ID13
Feature activation+7.80e-01

Pos logit contributions
rou+0.056
arl+0.056
run+0.053
om+0.043
ounces+0.041
Neg logit contributions
ured-0.113
 exercises-0.111
 therapist-0.110
-0.108
 poses-0.108
Token<|endoftext|>
Token ID50256
Feature activation+3.34e+00

Pos logit contributions
arl+0.018
rou+0.016
run+0.016
arse+0.015
ounces+0.015
Neg logit contributions
Her-0.016
-0.016
 herself-0.015
 therapist-0.015
zed-0.015
TokenOnce
Token ID7454
Feature activation+6.35e-01

Pos logit contributions
rou+0.104
arl+0.096
run+0.091
ucker+0.086
ounces+0.082
Neg logit contributions
ger-0.054
 therapist-0.051
seek-0.050
ves-0.047
 middle-0.047
Token upon
Token ID2402
Feature activation+7.82e-01

Pos logit contributions
rou+0.011
arl+0.010
hin+0.010
run+0.009
ounces+0.009
Neg logit contributions
 particular-0.016
ves-0.015
Her-0.014
 exercises-0.014
-0.014
TokenFire
Token ID13543
Feature activation-1.58e+00

Pos logit contributions
arl+0.025
rou+0.024
run+0.022
ounces+0.020
ucker+0.020
Neg logit contributions
 exercises-0.019
ured-0.019
 poses-0.019
 therapist-0.019
seek-0.019
Token is
Token ID318
Feature activation+2.12e-01

Pos logit contributions
 exercises+0.034
 therapist+0.033
ured+0.033
 poses+0.033
seek+0.033
Neg logit contributions
arl-0.037
rou-0.034
run-0.033
om-0.030
ounces-0.029
Token not
Token ID407
Feature activation+2.61e+00

Pos logit contributions
arl+0.004
rou+0.004
run+0.004
om+0.003
ounces+0.003
Neg logit contributions
 therapist-0.005
 exercises-0.005
ured-0.005
 poses-0.005
seek-0.005
Token a
Token ID257
Feature activation+4.76e+00

Pos logit contributions
arl+0.062
rou+0.058
run+0.055
ucker+0.050
ounces+0.049
Neg logit contributions
 exercises-0.057
ured-0.055
 poses-0.054
nic-0.051
-0.051
Token toy
Token ID13373
Feature activation+6.22e-02

Pos logit contributions
arl+0.116
run+0.110
rou+0.110
ucker+0.097
ounces+0.094
Neg logit contributions
 exercises-0.097
 poses-0.095
ured-0.094
 therapist-0.094
ger-0.094
Token"
Token ID1
Feature activation+5.79e+00

Pos logit contributions
arl+0.001
rou+0.001
run+0.001
om+0.001
ounces+0.001
Neg logit contributions
 exercises-0.002
 poses-0.002
ured-0.002
 therapist-0.002
seek-0.002
Token on
Token ID319
Feature activation-3.28e-01

Pos logit contributions
arl+0.127
rou+0.121
run+0.117
ounces+0.102
om+0.101
Neg logit contributions
 exercises-0.131
ured-0.130
 poses-0.127
 therapist-0.127
-0.124
Token the
Token ID262
Feature activation+9.85e-01

Pos logit contributions
 exercises+0.008
ured+0.008
 poses+0.008
 therapist+0.008
+0.008
Neg logit contributions
arl-0.006
rou-0.006
run-0.006
ounces-0.005
ucker-0.005
Token bottom
Token ID4220
Feature activation+8.70e-01

Pos logit contributions
arl+0.025
rou+0.024
run+0.023
ounces+0.020
om+0.020
Neg logit contributions
 exercises-0.019
ured-0.019
 poses-0.019
 therapist-0.019
-0.018
Token.
Token ID13
Feature activation+6.17e-01

Pos logit contributions
arl+0.017
rou+0.016
run+0.015
ounces+0.014
ucker+0.013
Neg logit contributions
 exercises-0.022
ured-0.021
 poses-0.021
ger-0.021
 therapist-0.021
Token They
Token ID1119
Feature activation+8.68e-02

Pos logit contributions
rou+0.016
arl+0.015
run+0.014
arse+0.013
ounces+0.013
Neg logit contributions
 poses-0.012
 exercises-0.012
seek-0.011
ured-0.011
edly-0.011
Token par
Token ID1582
Feature activation+1.91e+00

Pos logit contributions
arl+0.011
run+0.011
rou+0.010
ucker+0.009
ounces+0.009
Neg logit contributions
 therapist-0.009
ger-0.009
 exercises-0.009
ured-0.008
 poses-0.008
Tokenrot
Token ID10599
Feature activation+1.02e+00

Pos logit contributions
arl+0.056
run+0.052
rou+0.050
 Johnny+0.049
etchup+0.046
Neg logit contributions
ves-0.030
ured-0.028
seek-0.028
 therapist-0.026
nic-0.026
Token was
Token ID373
Feature activation+6.14e-01

Pos logit contributions
rou+0.023
arl+0.023
run+0.022
om+0.020
ounces+0.019
Neg logit contributions
ured-0.022
 exercises-0.021
 poses-0.021
 therapist-0.021
-0.020
Token the
Token ID262
Feature activation+2.90e+00

Pos logit contributions
arl+0.015
run+0.013
rou+0.013
ounces+0.012
ucker+0.012
Neg logit contributions
 exercises-0.013
ured-0.012
 therapist-0.012
 poses-0.012
-0.012
Token happiest
Token ID49414
Feature activation+2.24e+00

Pos logit contributions
arl+0.094
run+0.090
rou+0.087
ounces+0.081
ucker+0.080
Neg logit contributions
 exercises-0.040
 therapist-0.037
 poses-0.037
ured-0.037
ger-0.035
Token he
Token ID339
Feature activation+5.78e+00

Pos logit contributions
arl+0.059
rou+0.054
run+0.054
ounces+0.048
ucker+0.047
Neg logit contributions
 therapist-0.044
ger-0.044
 exercises-0.041
 poses-0.041
ured-0.041
Token had
Token ID550
Feature activation+5.70e+00

Pos logit contributions
arl+0.137
rou+0.130
run+0.127
ounces+0.111
om+0.110
Neg logit contributions
 exercises-0.122
ured-0.120
 poses-0.118
 therapist-0.117
-0.114
Token ever
Token ID1683
Feature activation+2.19e+00

Pos logit contributions
arl+0.128
rou+0.120
run+0.116
ounces+0.102
ucker+0.100
Neg logit contributions
 exercises-0.128
ured-0.127
 therapist-0.124
 poses-0.124
seek-0.121
Token been
Token ID587
Feature activation+2.85e+00

Pos logit contributions
arl+0.049
rou+0.044
run+0.042
ucker+0.037
ounces+0.037
Neg logit contributions
 exercises-0.051
ured-0.051
 poses-0.050
seek-0.050
 therapist-0.049
Token.
Token ID13
Feature activation+6.29e-01

Pos logit contributions
arl+0.068
rou+0.058
run+0.054
ucker+0.052
ounces+0.049
Neg logit contributions
ured-0.065
 exercises-0.063
seek-0.063
 therapist-0.062
ger-0.062
Token<|endoftext|>
Token ID50256
Feature activation+3.67e+00

Pos logit contributions
arl+0.015
rou+0.014
run+0.012
ounces+0.012
arse+0.011
Neg logit contributions
 therapist-0.013
-0.013
-0.013
 exercises-0.013
zed-0.013
Token and
Token ID290
Feature activation+2.77e-02

Pos logit contributions
arl+0.002
rou+0.002
run+0.002
ounces+0.002
om+0.002
Neg logit contributions
 exercises-0.002
ured-0.002
 poses-0.002
 therapist-0.002
-0.002
Token mysterious
Token ID11428
Feature activation-2.54e-01

Pos logit contributions
arl+0.001
rou+0.001
run+0.001
ounces+0.001
om+0.001
Neg logit contributions
 exercises-0.001
ured-0.001
 poses-0.001
 therapist-0.001
-0.001
Token land
Token ID1956
Feature activation+1.25e+00

Pos logit contributions
 exercises+0.005
ured+0.005
 poses+0.005
 therapist+0.005
+0.005
Neg logit contributions
arl-0.006
rou-0.006
run-0.006
ounces-0.005
om-0.005
Token that
Token ID326
Feature activation+2.58e+00

Pos logit contributions
arl+0.026
rou+0.025
run+0.024
om+0.021
ounces+0.021
Neg logit contributions
 exercises-0.030
ured-0.029
 poses-0.029
 therapist-0.029
-0.028
Token was
Token ID373
Feature activation+8.64e-01

Pos logit contributions
rou+0.054
arl+0.054
run+0.053
om+0.046
ounces+0.045
Neg logit contributions
ured-0.058
 exercises-0.057
 poses-0.056
 therapist-0.056
-0.055
Token ruled
Token ID8879
Feature activation+5.76e+00

Pos logit contributions
arl+0.018
rou+0.017
run+0.017
ounces+0.014
om+0.014
Neg logit contributions
 exercises-0.021
ured-0.020
 poses-0.020
 therapist-0.020
-0.019
Token by
Token ID416
Feature activation-1.23e+00

Pos logit contributions
arl+0.142
run+0.132
rou+0.132
ucker+0.124
ounces+0.124
Neg logit contributions
ured-0.115
ger-0.113
seek-0.110
 exercises-0.109
 therapist-0.106
Token a
Token ID257
Feature activation+2.54e+00

Pos logit contributions
 exercises+0.029
ured+0.028
 poses+0.028
ger+0.027
 therapist+0.026
Neg logit contributions
arl-0.028
rou-0.025
run-0.024
ucker-0.023
ounces-0.022
Token powerful
Token ID3665
Feature activation+1.75e+00

Pos logit contributions
arl+0.064
run+0.059
rou+0.057
ucker+0.053
ounces+0.051
Neg logit contributions
 exercises-0.051
 therapist-0.050
ger-0.049
ured-0.049
 poses-0.049
Token witch
Token ID16365
Feature activation+2.78e+00

Pos logit contributions
arl+0.042
rou+0.039
run+0.039
ounces+0.034
ucker+0.034
Neg logit contributions
 exercises-0.037
ured-0.036
 therapist-0.036
ger-0.036
 poses-0.036
Token.
Token ID13
Feature activation+6.97e-01

Pos logit contributions
arl+0.057
rou+0.052
run+0.050
ounces+0.043
ucker+0.043
Neg logit contributions
 exercises-0.068
ured-0.068
 therapist-0.067
 poses-0.067
ger-0.065
Token always
Token ID1464
Feature activation+1.78e+00

Pos logit contributions
 exercises+0.006
ured+0.006
 therapist+0.006
 poses+0.006
+0.005
Neg logit contributions
arl-0.005
rou-0.004
run-0.004
ounces-0.004
om-0.004
Token proud
Token ID6613
Feature activation+9.76e-01

Pos logit contributions
arl+0.040
rou+0.036
run+0.035
ounces+0.032
ucker+0.031
Neg logit contributions
 exercises-0.041
ured-0.040
 therapist-0.039
 poses-0.039
-0.038
Token of
Token ID286
Feature activation+2.33e+00

Pos logit contributions
arl+0.014
rou+0.012
run+0.012
ounces+0.009
om+0.009
Neg logit contributions
 exercises-0.030
ured-0.030
 poses-0.030
 therapist-0.029
-0.029
Token the
Token ID262
Feature activation+7.93e-01

Pos logit contributions
arl+0.064
rou+0.062
run+0.060
ounces+0.053
om+0.052
Neg logit contributions
 therapist-0.039
ured-0.038
 exercises-0.038
seek-0.037
 poses-0.037
Token things
Token ID1243
Feature activation-5.66e-01

Pos logit contributions
arl+0.018
rou+0.018
run+0.017
om+0.015
ounces+0.015
Neg logit contributions
ured-0.017
 exercises-0.016
seek-0.016
-0.016
 therapist-0.016
Token she
Token ID673
Feature activation+5.73e+00

Pos logit contributions
ured+0.012
 exercises+0.012
+0.011
 poses+0.011
seek+0.011
Neg logit contributions
rou-0.013
arl-0.013
run-0.013
 Johnny-0.011
 upon-0.011
Token could
Token ID714
Feature activation+2.70e+00

Pos logit contributions
rou+0.121
arl+0.118
run+0.115
o+0.098
 upon+0.096
Neg logit contributions
 exercises-0.134
ured-0.133
 poses-0.130
 therapist-0.126
seek-0.121
Token do
Token ID466
Feature activation-7.89e-01

Pos logit contributions
arl+0.050
rou+0.049
run+0.047
 Johnny+0.040
 upon+0.040
Neg logit contributions
 therapist-0.067
ured-0.065
nic-0.063
ger-0.063
gers-0.063
Token,
Token ID11
Feature activation+1.52e+00

Pos logit contributions
 therapist+0.023
ured+0.023
 exercises+0.022
nic+0.022
ger+0.022
Neg logit contributions
rou-0.011
arl-0.010
run-0.010
 upon-0.009
 Johnny-0.008
Token even
Token ID772
Feature activation-4.60e-01

Pos logit contributions
arl+0.036
rou+0.033
run+0.032
ounces+0.029
om+0.028
Neg logit contributions
 exercises-0.033
ured-0.032
 poses-0.032
 therapist-0.031
-0.030
Token when
Token ID618
Feature activation+1.16e+00

Pos logit contributions
 exercises+0.010
ured+0.010
 poses+0.010
 therapist+0.010
+0.010
Neg logit contributions
arl-0.010
rou-0.010
run-0.009
ounces-0.008
om-0.008
Tokenrot
Token ID10599
Feature activation+1.02e+00

Pos logit contributions
arl+0.056
run+0.052
rou+0.050
 Johnny+0.049
etchup+0.046
Neg logit contributions
ves-0.030
ured-0.028
seek-0.028
 therapist-0.026
nic-0.026
Token was
Token ID373
Feature activation+6.14e-01

Pos logit contributions
rou+0.023
arl+0.023
run+0.022
om+0.020
ounces+0.019
Neg logit contributions
ured-0.022
 exercises-0.021
 poses-0.021
 therapist-0.021
-0.020
Token the
Token ID262
Feature activation+2.90e+00

Pos logit contributions
arl+0.015
run+0.013
rou+0.013
ounces+0.012
ucker+0.012
Neg logit contributions
 exercises-0.013
ured-0.012
 therapist-0.012
 poses-0.012
-0.012
Token happiest
Token ID49414
Feature activation+2.24e+00

Pos logit contributions
arl+0.094
run+0.090
rou+0.087
ounces+0.081
ucker+0.080
Neg logit contributions
 exercises-0.040
 therapist-0.037
 poses-0.037
ured-0.037
ger-0.035
Token he
Token ID339
Feature activation+5.78e+00

Pos logit contributions
arl+0.059
rou+0.054
run+0.054
ounces+0.048
ucker+0.047
Neg logit contributions
 therapist-0.044
ger-0.044
 exercises-0.041
 poses-0.041
ured-0.041
Token had
Token ID550
Feature activation+5.70e+00

Pos logit contributions
arl+0.137
rou+0.130
run+0.127
ounces+0.111
om+0.110
Neg logit contributions
 exercises-0.122
ured-0.120
 poses-0.118
 therapist-0.117
-0.114
Token ever
Token ID1683
Feature activation+2.19e+00

Pos logit contributions
arl+0.128
rou+0.120
run+0.116
ounces+0.102
ucker+0.100
Neg logit contributions
 exercises-0.128
ured-0.127
 therapist-0.124
 poses-0.124
seek-0.121
Token been
Token ID587
Feature activation+2.85e+00

Pos logit contributions
arl+0.049
rou+0.044
run+0.042
ucker+0.037
ounces+0.037
Neg logit contributions
 exercises-0.051
ured-0.051
 poses-0.050
seek-0.050
 therapist-0.049
Token.
Token ID13
Feature activation+6.29e-01

Pos logit contributions
arl+0.068
rou+0.058
run+0.054
ucker+0.052
ounces+0.049
Neg logit contributions
ured-0.065
 exercises-0.063
seek-0.063
 therapist-0.062
ger-0.062
Token<|endoftext|>
Token ID50256
Feature activation+3.67e+00

Pos logit contributions
arl+0.015
rou+0.014
run+0.012
ounces+0.012
arse+0.011
Neg logit contributions
 therapist-0.013
-0.013
-0.013
 exercises-0.013
zed-0.013
TokenOnce
Token ID7454
Feature activation+5.82e-01

Pos logit contributions
arl+0.096
rou+0.096
run+0.085
ucker+0.082
ounces+0.077
Neg logit contributions
ger-0.077
 therapist-0.072
gers-0.068
 exercises-0.067
seek-0.066
Token was
Token ID373
Feature activation+3.62e-01

Pos logit contributions
ured+0.018
seek+0.018
 exercises+0.018
nic+0.017
 poses+0.017
Neg logit contributions
rou-0.017
run-0.015
arl-0.013
 upon-0.013
o-0.013
Token the
Token ID262
Feature activation+2.08e+00

Pos logit contributions
arl+0.007
rou+0.007
run+0.007
ounces+0.006
om+0.006
Neg logit contributions
 exercises-0.009
ured-0.009
 poses-0.009
 therapist-0.009
-0.008
Token most
Token ID749
Feature activation+7.72e-01

Pos logit contributions
arl+0.060
rou+0.058
run+0.057
ounces+0.052
om+0.051
Neg logit contributions
 exercises-0.034
ured-0.033
 therapist-0.032
 poses-0.032
-0.030
Token interesting
Token ID3499
Feature activation+5.00e-01

Pos logit contributions
arl+0.020
rou+0.020
run+0.019
om+0.017
ounces+0.016
Neg logit contributions
ured-0.014
 exercises-0.014
 poses-0.014
 therapist-0.014
seek-0.014
Token thing
Token ID1517
Feature activation-6.27e-01

Pos logit contributions
arl+0.011
rou+0.011
run+0.011
om+0.009
joy+0.009
Neg logit contributions
ured-0.011
seek-0.010
 exercises-0.010
-0.010
 poses-0.010
Token she
Token ID673
Feature activation+5.70e+00

Pos logit contributions
ured+0.015
 exercises+0.015
seek+0.014
+0.014
 poses+0.014
Neg logit contributions
run-0.013
rou-0.012
arl-0.012
o-0.010
 upon-0.010
Token had
Token ID550
Feature activation+5.27e+00

Pos logit contributions
rou+0.116
run+0.112
arl+0.111
 upon+0.096
o+0.094
Neg logit contributions
 exercises-0.137
ured-0.134
 poses-0.130
seek-0.126
 therapist-0.126
Token ever
Token ID1683
Feature activation+1.57e+00

Pos logit contributions
rou+0.085
arl+0.083
run+0.082
om+0.068
 upon+0.068
Neg logit contributions
 exercises-0.146
ured-0.142
 therapist-0.140
 poses-0.139
-0.135
Token seen
Token ID1775
Feature activation+2.60e+00

Pos logit contributions
arl+0.031
rou+0.029
run+0.028
ounces+0.024
om+0.024
Neg logit contributions
 exercises-0.039
ured-0.039
 poses-0.038
 therapist-0.038
-0.037
Token.
Token ID13
Feature activation+7.96e-01

Pos logit contributions
rou+0.044
arl+0.043
run+0.041
ounces+0.033
om+0.033
Neg logit contributions
ured-0.070
 exercises-0.070
 poses-0.068
 therapist-0.068
-0.067
Token<|endoftext|>
Token ID50256
Feature activation+3.94e+00

Pos logit contributions
arl+0.017
rou+0.016
ounces+0.014
run+0.014
ucker+0.013
Neg logit contributions
 therapist-0.018
-0.018
-0.018
 exercises-0.017
Her-0.017
Token was
Token ID373
Feature activation+2.20e-01

Pos logit contributions
ured+0.035
 therapist+0.034
+0.033
nic+0.033
 poses+0.033
Neg logit contributions
rou-0.028
run-0.025
 upon-0.023
arl-0.023
om-0.022
Token the
Token ID262
Feature activation+1.58e+00

Pos logit contributions
rou+0.004
arl+0.004
run+0.004
ounces+0.003
om+0.003
Neg logit contributions
ured-0.006
 therapist-0.006
 poses-0.006
 exercises-0.006
seek-0.006
Token most
Token ID749
Feature activation-1.12e-01

Pos logit contributions
arl+0.045
run+0.044
rou+0.042
ounces+0.039
ucker+0.038
Neg logit contributions
 exercises-0.027
ured-0.026
 therapist-0.025
 poses-0.025
ger-0.024
Token beautiful
Token ID4950
Feature activation+1.12e+00

Pos logit contributions
 exercises+0.002
ured+0.002
 poses+0.002
 therapist+0.002
+0.002
Neg logit contributions
arl-0.003
rou-0.003
run-0.003
om-0.002
ounces-0.002
Token horn
Token ID12718
Feature activation-5.92e-01

Pos logit contributions
arl+0.028
rou+0.027
run+0.026
om+0.024
ounces+0.023
Neg logit contributions
ured-0.022
 exercises-0.021
 poses-0.020
 therapist-0.020
seek-0.020
Token she
Token ID673
Feature activation+5.67e+00

Pos logit contributions
 exercises+0.017
 poses+0.017
ured+0.017
+0.016
seek+0.016
Neg logit contributions
rou-0.009
run-0.009
arl-0.009
om-0.007
ounces-0.007
Token'd
Token ID1549
Feature activation+4.54e+00

Pos logit contributions
rou+0.112
run+0.111
arl+0.110
 upon+0.094
o+0.093
Neg logit contributions
 exercises-0.140
ured-0.134
 poses-0.133
 therapist-0.126
seek-0.124
Token ever
Token ID1683
Feature activation+1.79e+00

Pos logit contributions
arl+0.067
run+0.066
rou+0.066
 upon+0.052
ounces+0.052
Neg logit contributions
 exercises-0.132
ured-0.128
 therapist-0.127
 poses-0.126
bees-0.121
Token seen
Token ID1775
Feature activation+2.40e+00

Pos logit contributions
arl+0.039
rou+0.035
run+0.033
ucker+0.029
ounces+0.029
Neg logit contributions
 exercises-0.042
ured-0.042
seek-0.042
 poses-0.042
 therapist-0.041
Token.
Token ID13
Feature activation+9.32e-02

Pos logit contributions
rou+0.038
arl+0.038
run+0.035
om+0.029
ounces+0.029
Neg logit contributions
 exercises-0.067
ured-0.067
 poses-0.065
 therapist-0.065
-0.064
Token\n
Token ID198
Feature activation+9.26e-01

Pos logit contributions
arl+0.002
rou+0.002
run+0.002
ounces+0.002
hin+0.002
Neg logit contributions
 therapist-0.002
-0.002
 exercises-0.002
 poses-0.002
ured-0.002
Token away
Token ID1497
Feature activation-1.07e+00

Pos logit contributions
 therapist+0.010
seek+0.010
 poses+0.010
 exercises+0.010
nic+0.009
Neg logit contributions
arl-0.006
rou-0.006
run-0.006
 late-0.005
 upon-0.005
Token from
Token ID422
Feature activation+9.38e-01

Pos logit contributions
 exercises+0.027
ured+0.026
 therapist+0.026
 poses+0.026
+0.026
Neg logit contributions
arl-0.021
rou-0.020
run-0.019
om-0.016
ounces-0.016
Token any
Token ID597
Feature activation-7.48e-01

Pos logit contributions
arl+0.021
rou+0.020
run+0.019
ounces+0.017
om+0.017
Neg logit contributions
 exercises-0.021
ured-0.021
 poses-0.020
 therapist-0.020
-0.020
Token dangerous
Token ID4923
Feature activation-1.29e+00

Pos logit contributions
 exercises+0.016
ured+0.016
 poses+0.016
 therapist+0.016
+0.015
Neg logit contributions
arl-0.017
rou-0.017
run-0.016
ounces-0.014
om-0.014
Token things
Token ID1243
Feature activation-2.98e-01

Pos logit contributions
ured+0.030
 exercises+0.029
 therapist+0.029
seek+0.029
 poses+0.028
Neg logit contributions
arl-0.027
rou-0.026
run-0.025
om-0.022
ounces-0.022
Token she
Token ID673
Feature activation+5.66e+00

Pos logit contributions
 exercises+0.008
ured+0.008
 therapist+0.007
 poses+0.007
+0.007
Neg logit contributions
arl-0.005
run-0.005
rou-0.005
 upon-0.004
ounces-0.004
Token might
Token ID1244
Feature activation+3.04e+00

Pos logit contributions
rou+0.117
arl+0.114
run+0.112
o+0.098
 upon+0.098
Neg logit contributions
ured-0.132
 exercises-0.131
 poses-0.125
 therapist-0.125
ves-0.120
Token find
Token ID1064
Feature activation+2.12e+00

Pos logit contributions
arl+0.064
rou+0.059
run+0.058
ounces+0.049
om+0.049
Neg logit contributions
 therapist-0.073
 exercises-0.071
ured-0.070
 poses-0.069
nic-0.066
Token.
Token ID13
Feature activation+6.20e-01

Pos logit contributions
arl+0.030
run+0.029
rou+0.029
ounces+0.021
 Johnny+0.021
Neg logit contributions
 exercises-0.062
 therapist-0.062
ured-0.061
 poses-0.060
-0.060
Token<|endoftext|>
Token ID50256
Feature activation+3.54e+00

Pos logit contributions
arl+0.014
rou+0.012
ounces+0.011
arse+0.011
run+0.011
Neg logit contributions
 exercises-0.013
 therapist-0.013
-0.013
seek-0.013
Her-0.013
TokenOnce
Token ID7454
Feature activation+6.85e-01

Pos logit contributions
rou+0.090
arl+0.088
run+0.079
ucker+0.076
ounces+0.071
Neg logit contributions
ger-0.077
 therapist-0.074
seek-0.069
 exercises-0.068
gers-0.067
Token a
Token ID257
Feature activation+3.66e+00

Pos logit contributions
rou+0.020
arl+0.020
run+0.019
om+0.016
ounces+0.016
Neg logit contributions
 poses-0.027
 exercises-0.027
ured-0.027
 therapist-0.026
seek-0.026
Token lot
Token ID1256
Feature activation+1.75e+00

Pos logit contributions
arl+0.094
run+0.087
rou+0.086
ounces+0.077
ucker+0.077
Neg logit contributions
 exercises-0.071
 therapist-0.070
ured-0.069
 poses-0.068
ger-0.066
Token more
Token ID517
Feature activation+3.13e-02

Pos logit contributions
arl+0.021
rou+0.020
run+0.019
om+0.013
ounces+0.012
Neg logit contributions
 exercises-0.057
ured-0.057
-0.056
 therapist-0.056
 poses-0.056
Token magical
Token ID10883
Feature activation+1.06e-01

Pos logit contributions
rou+0.001
run+0.001
arl+0.001
 late+0.001
joy+0.001
Neg logit contributions
ured-0.001
-0.001
seek-0.001
 therapist-0.001
-0.001
Token than
Token ID621
Feature activation+1.90e+00

Pos logit contributions
run+0.002
rou+0.002
arl+0.002
joy+0.001
om+0.001
Neg logit contributions
-0.003
ured-0.003
 therapist-0.003
-0.003
Her-0.002
Token she
Token ID673
Feature activation+5.64e+00

Pos logit contributions
arl+0.049
rou+0.045
run+0.043
ucker+0.040
om+0.040
Neg logit contributions
 exercises-0.038
ured-0.037
 poses-0.037
 therapist-0.036
seek-0.034
Token first
Token ID717
Feature activation-1.31e-01

Pos logit contributions
rou+0.114
run+0.113
arl+0.105
 upon+0.093
o+0.093
Neg logit contributions
 exercises-0.133
ured-0.131
 poses-0.127
 therapist-0.122
-0.121
Token thought
Token ID1807
Feature activation+2.50e+00

Pos logit contributions
 exercises+0.004
ured+0.004
 poses+0.003
 therapist+0.003
+0.003
Neg logit contributions
arl-0.002
rou-0.002
run-0.002
om-0.002
ounces-0.002
Token.
Token ID13
Feature activation+1.22e-01

Pos logit contributions
run+0.039
arl+0.038
rou+0.037
om+0.028
 upon+0.028
Neg logit contributions
ured-0.071
 exercises-0.070
seek-0.068
 therapist-0.068
 poses-0.068
Token<|endoftext|>
Token ID50256
Feature activation+3.67e+00

Pos logit contributions
arl+0.003
rou+0.003
run+0.003
ounces+0.003
ucker+0.003
Neg logit contributions
 exercises-0.002
 therapist-0.002
ured-0.002
 poses-0.002
-0.002
TokenThe
Token ID464
Feature activation-6.92e-01

Pos logit contributions
rou+0.094
arl+0.093
run+0.082
ucker+0.080
ounces+0.076
Neg logit contributions
ger-0.079
 therapist-0.073
gers-0.069
 exercises-0.069
-0.068
Token is
Token ID318
Feature activation-2.69e-01

Pos logit contributions
ured+0.046
edly+0.043
 exercises+0.042
+0.042
seek+0.042
Neg logit contributions
arl-0.024
rou-0.023
om-0.021
etchup-0.019
run-0.019
Token the
Token ID262
Feature activation+2.30e+00

Pos logit contributions
 exercises+0.008
ured+0.008
 therapist+0.007
 poses+0.007
seek+0.007
Neg logit contributions
arl-0.004
rou-0.004
run-0.004
ounces-0.003
om-0.003
Token most
Token ID749
Feature activation+2.80e-01

Pos logit contributions
arl+0.059
rou+0.057
run+0.055
ounces+0.049
om+0.049
Neg logit contributions
 exercises-0.044
ured-0.043
 poses-0.042
 therapist-0.042
-0.041
Token beautiful
Token ID4950
Feature activation+1.48e+00

Pos logit contributions
rou+0.007
arl+0.007
run+0.006
om+0.006
ucker+0.006
Neg logit contributions
ured-0.006
 exercises-0.006
seek-0.006
 therapist-0.005
 poses-0.005
Token card
Token ID2657
Feature activation+1.15e+00

Pos logit contributions
arl+0.035
rou+0.032
run+0.031
om+0.028
etchup+0.028
Neg logit contributions
ured-0.031
 exercises-0.030
-0.030
seek-0.029
 poses-0.028
Token I
Token ID314
Feature activation+5.62e+00

Pos logit contributions
arl+0.015
rou+0.014
run+0.014
o+0.012
om+0.011
Neg logit contributions
 exercises-0.035
ured-0.035
-0.034
 poses-0.034
seek-0.033
Token ever
Token ID1683
Feature activation+1.69e+00

Pos logit contributions
arl+0.111
rou+0.104
run+0.102
ounces+0.086
om+0.086
Neg logit contributions
 exercises-0.143
ured-0.139
 poses-0.138
 therapist-0.135
-0.132
Token got
Token ID1392
Feature activation+3.29e+00

Pos logit contributions
arl+0.039
rou+0.037
run+0.036
ounces+0.032
om+0.031
Neg logit contributions
 exercises-0.036
ured-0.036
 poses-0.035
 therapist-0.035
-0.034
Token.
Token ID13
Feature activation+6.41e-01

Pos logit contributions
arl+0.068
rou+0.062
run+0.059
ucker+0.051
ounces+0.051
Neg logit contributions
ured-0.081
 exercises-0.078
 poses-0.077
 therapist-0.077
ger-0.076
Token You
Token ID921
Feature activation+1.46e-01

Pos logit contributions
arl+0.011
rou+0.011
run+0.010
ounces+0.009
om+0.008
Neg logit contributions
 exercises-0.017
ured-0.017
 poses-0.017
 therapist-0.017
-0.017
Token are
Token ID389
Feature activation-3.52e-01

Pos logit contributions
arl+0.003
rou+0.002
run+0.002
om+0.002
ounces+0.002
Neg logit contributions
 exercises-0.004
ured-0.004
 poses-0.004
 therapist-0.004
-0.004
Token daddy
Token ID35822
Feature activation+1.21e+00

Pos logit contributions
 exercises+0.001
ured+0.001
 poses+0.001
 therapist+0.001
+0.001
Neg logit contributions
arl-0.001
rou-0.001
run-0.001
ounces-0.001
om-0.001
Token all
Token ID477
Feature activation+1.12e+00

Pos logit contributions
arl+0.027
rou+0.025
run+0.024
ounces+0.021
om+0.021
Neg logit contributions
 exercises-0.028
ured-0.027
 poses-0.027
 therapist-0.027
-0.026
Token about
Token ID546
Feature activation+1.48e+00

Pos logit contributions
arl+0.025
rou+0.024
run+0.023
ounces+0.020
om+0.020
Neg logit contributions
ured-0.025
 exercises-0.025
 therapist-0.024
 poses-0.024
seek-0.024
Token the
Token ID262
Feature activation+4.14e-01

Pos logit contributions
arl+0.039
rou+0.038
run+0.037
ounces+0.033
om+0.033
Neg logit contributions
 exercises-0.027
ured-0.027
 poses-0.026
 therapist-0.026
-0.025
Token fun
Token ID1257
Feature activation-4.53e-01

Pos logit contributions
arl+0.010
rou+0.009
run+0.009
ounces+0.008
om+0.008
Neg logit contributions
 exercises-0.009
ured-0.009
 poses-0.009
 therapist-0.009
-0.008
Token she
Token ID673
Feature activation+5.61e+00

Pos logit contributions
ured+0.010
seek+0.009
+0.009
 exercises+0.009
 therapist+0.009
Neg logit contributions
arl-0.010
rou-0.010
run-0.010
 Johnny-0.009
ounces-0.009
Token had
Token ID550
Feature activation+4.17e+00

Pos logit contributions
arl+0.131
rou+0.128
run+0.123
ounces+0.108
om+0.107
Neg logit contributions
 exercises-0.119
ured-0.118
 poses-0.115
 therapist-0.113
-0.109
Token.
Token ID13
Feature activation+9.80e-02

Pos logit contributions
arl+0.065
rou+0.064
run+0.060
ounces+0.051
om+0.051
Neg logit contributions
 exercises-0.119
ured-0.116
 poses-0.114
 therapist-0.114
-0.111
Token<|endoftext|>
Token ID50256
Feature activation+2.96e+00

Pos logit contributions
arl+0.002
rou+0.002
run+0.002
ounces+0.002
ucker+0.001
Neg logit contributions
 exercises-0.002
 therapist-0.002
-0.002
seek-0.002
 poses-0.002
TokenOnce
Token ID7454
Feature activation+4.04e-01

Pos logit contributions
rou+0.086
arl+0.082
run+0.074
ucker+0.071
ounces+0.068
Neg logit contributions
seek-0.052
ger-0.051
ves-0.049
-0.047
zed-0.047
Token upon
Token ID2402
Feature activation+7.51e-01

Pos logit contributions
rou+0.007
arl+0.007
hin+0.006
run+0.006
ounces+0.005
Neg logit contributions
 particular-0.011
ves-0.010
 exercises-0.009
zed-0.009
Her-0.009
Token than
Token ID621
Feature activation+1.58e+00

Pos logit contributions
arl+0.001
rou+0.001
run+0.001
om+0.001
ounces+0.001
Neg logit contributions
 exercises-0.002
ured-0.002
 therapist-0.002
 poses-0.002
-0.002
Token the
Token ID262
Feature activation+2.19e+00

Pos logit contributions
arl+0.033
rou+0.031
run+0.030
ounces+0.026
om+0.026
Neg logit contributions
 exercises-0.037
ured-0.037
 poses-0.036
 therapist-0.036
-0.035
Token other
Token ID584
Feature activation-5.54e-01

Pos logit contributions
arl+0.057
rou+0.055
run+0.054
ounces+0.048
om+0.047
Neg logit contributions
 exercises-0.041
ured-0.041
 therapist-0.040
 poses-0.040
ger-0.039
Token penn
Token ID22429
Feature activation-1.09e+00

Pos logit contributions
 exercises+0.012
ured+0.012
 poses+0.012
 therapist+0.011
seek+0.011
Neg logit contributions
arl-0.012
rou-0.012
run-0.012
om-0.010
ounces-0.010
Tokenies
Token ID444
Feature activation+1.98e+00

Pos logit contributions
 exercises+0.024
ured+0.024
 poses+0.024
 therapist+0.024
ger+0.023
Neg logit contributions
arl-0.025
rou-0.022
run-0.022
ounces-0.019
om-0.018
Token she
Token ID673
Feature activation+5.61e+00

Pos logit contributions
arl+0.031
rou+0.030
run+0.029
ounces+0.024
om+0.024
Neg logit contributions
 exercises-0.056
ured-0.055
 poses-0.054
 therapist-0.054
seek-0.053
Token had
Token ID550
Feature activation+3.52e+00

Pos logit contributions
rou+0.121
run+0.119
arl+0.118
ounces+0.103
o+0.101
Neg logit contributions
 exercises-0.129
ured-0.121
 poses-0.120
 therapist-0.119
-0.114
Token.
Token ID13
Feature activation+1.32e-01

Pos logit contributions
arl+0.066
rou+0.066
run+0.063
om+0.055
ounces+0.054
Neg logit contributions
 exercises-0.091
 therapist-0.087
 poses-0.086
ured-0.086
-0.084
Token She
Token ID1375
Feature activation-1.93e-01

Pos logit contributions
arl+0.002
rou+0.002
run+0.002
ounces+0.002
om+0.002
Neg logit contributions
 exercises-0.004
ured-0.004
 therapist-0.004
 poses-0.004
-0.004
Token wondered
Token ID14028
Feature activation-3.26e-01

Pos logit contributions
 exercises+0.005
ured+0.004
seek+0.004
ger+0.004
 poses+0.004
Neg logit contributions
rou-0.004
run-0.004
om-0.003
o-0.003
ounces-0.003
Token what
Token ID644
Feature activation+3.43e-01

Pos logit contributions
 exercises+0.010
ured+0.010
 therapist+0.010
 poses+0.010
seek+0.009
Neg logit contributions
arl-0.005
rou-0.005
run-0.004
ounces-0.003
om-0.003
Token is
Token ID318
Feature activation-1.31e+00

Pos logit contributions
ured+0.032
seek+0.032
edly+0.030
nic+0.029
 therapist+0.029
Neg logit contributions
rou-0.015
ounces-0.014
arl-0.013
run-0.013
om-0.013
Token the
Token ID262
Feature activation+1.59e+00

Pos logit contributions
seek+0.035
 therapist+0.035
 exercises+0.034
ured+0.034
 poses+0.033
Neg logit contributions
arl-0.023
rou-0.023
run-0.022
om-0.019
ounces-0.019
Token most
Token ID749
Feature activation-9.82e-02

Pos logit contributions
arl+0.039
rou+0.037
run+0.033
ounces+0.032
om+0.032
Neg logit contributions
ured-0.032
 exercises-0.032
seek-0.031
-0.031
zed-0.030
Token beautiful
Token ID4950
Feature activation+1.04e+00

Pos logit contributions
ured+0.003
seek+0.003
 exercises+0.002
zed+0.002
+0.002
Neg logit contributions
rou-0.002
arl-0.002
run-0.002
ucker-0.001
om-0.001
Token rose
Token ID8278
Feature activation+5.09e-01

Pos logit contributions
arl+0.024
rou+0.023
run+0.021
 upon+0.020
om+0.020
Neg logit contributions
seek-0.022
ured-0.022
 exercises-0.020
-0.020
nic-0.020
Token she
Token ID673
Feature activation+5.61e+00

Pos logit contributions
rou+0.006
run+0.006
arl+0.006
o+0.005
om+0.005
Neg logit contributions
 exercises-0.015
-0.015
ured-0.015
seek-0.015
 sessions-0.015
Token has
Token ID468
Feature activation+3.92e+00

Pos logit contributions
rou+0.104
arl+0.103
run+0.102
 upon+0.087
ounces+0.087
Neg logit contributions
 exercises-0.141
ured-0.139
 therapist-0.130
 poses-0.130
seek-0.130
Token ever
Token ID1683
Feature activation+1.17e+00

Pos logit contributions
rou+0.057
arl+0.054
run+0.053
om+0.045
ounces+0.045
Neg logit contributions
 exercises-0.115
ured-0.111
-0.109
 therapist-0.109
 poses-0.106
Token seen
Token ID1775
Feature activation+1.97e+00

Pos logit contributions
arl+0.020
rou+0.019
run+0.018
ounces+0.015
om+0.015
Neg logit contributions
 exercises-0.032
ured-0.032
 therapist-0.032
 poses-0.031
-0.031
Token.
Token ID13
Feature activation-1.92e-01

Pos logit contributions
arl+0.029
rou+0.029
run+0.027
ounces+0.022
om+0.021
Neg logit contributions
ured-0.057
 exercises-0.057
 therapist-0.055
 poses-0.055
-0.055
Token She
Token ID1375
Feature activation-4.20e-02

Pos logit contributions
 exercises+0.005
ured+0.005
 poses+0.005
 therapist+0.005
seek+0.005
Neg logit contributions
arl-0.003
run-0.003
rou-0.003
ounces-0.002
om-0.002
Token number
Token ID1271
Feature activation-3.47e+00

Pos logit contributions
 exercises+0.011
ured+0.011
 poses+0.011
 therapist+0.011
+0.010
Neg logit contributions
arl-0.020
rou-0.020
run-0.019
ounces-0.017
ucker-0.017
Token.
Token ID13
Feature activation-8.14e-01

Pos logit contributions
 exercises+0.091
ured+0.090
 poses+0.088
 therapist+0.088
+0.086
Neg logit contributions
arl-0.064
rou-0.060
run-0.058
ounces-0.049
om-0.049
Token "
Token ID366
Feature activation-3.06e+00

Pos logit contributions
 exercises+0.018
 poses+0.018
+0.017
ger+0.017
ured+0.017
Neg logit contributions
rou-0.019
arl-0.018
run-0.015
arse-0.015
ounces-0.014
TokenI
Token ID40
Feature activation-3.55e-01

Pos logit contributions
 exercises+0.067
ured+0.067
 therapist+0.065
 poses+0.064
ger+0.063
Neg logit contributions
arl-0.069
run-0.064
rou-0.063
om-0.054
etchup-0.053
Token weigh
Token ID10164
Feature activation-3.19e+00

Pos logit contributions
 exercises+0.009
ured+0.009
 poses+0.008
 therapist+0.008
+0.008
Neg logit contributions
arl-0.007
rou-0.007
run-0.007
ounces-0.006
om-0.006
Token 40
Token ID2319
Feature activation-6.84e+00

Pos logit contributions
 exercises+0.083
ured+0.083
 therapist+0.082
 poses+0.081
seek+0.080
Neg logit contributions
arl-0.058
rou-0.055
run-0.054
om-0.045
ounces-0.045
Token pounds
Token ID8059
Feature activation-2.55e+00

Pos logit contributions
ured+0.170
 therapist+0.170
seek+0.164
 exercises+0.163
ger+0.162
Neg logit contributions
arl-0.129
rou-0.128
run-0.125
om-0.111
ounces-0.106
Token.
Token ID13
Feature activation-1.65e-01

Pos logit contributions
 exercises+0.076
ured+0.076
 poses+0.075
 therapist+0.075
+0.073
Neg logit contributions
arl-0.037
rou-0.034
run-0.032
om-0.026
ounces-0.026
Token That
Token ID1320
Feature activation-2.44e+00

Pos logit contributions
 exercises+0.004
ured+0.004
 poses+0.004
 therapist+0.004
+0.004
Neg logit contributions
arl-0.004
rou-0.003
run-0.003
ounces-0.003
ucker-0.003
Token's
Token ID338
Feature activation-9.48e-01

Pos logit contributions
ured+0.059
edly+0.056
seek+0.055
 therapist+0.055
+0.053
Neg logit contributions
rou-0.047
run-0.047
arl-0.046
om-0.044
o-0.041
Token a
Token ID257
Feature activation+1.67e+00

Pos logit contributions
 exercises+0.022
ured+0.022
 poses+0.021
 therapist+0.021
+0.021
Neg logit contributions
arl-0.021
rou-0.020
run-0.019
ounces-0.016
om-0.016
Token could
Token ID714
Feature activation-7.30e-02

Pos logit contributions
rou+0.026
run+0.025
arl+0.025
om+0.023
o+0.022
Neg logit contributions
ured-0.026
 exercises-0.025
 poses-0.024
 therapist-0.024
seek-0.023
Token only
Token ID691
Feature activation-1.93e+00

Pos logit contributions
ger+0.002
ured+0.002
edly+0.002
 therapist+0.002
cious+0.002
Neg logit contributions
arl-0.001
rou-0.001
run-0.001
o-0.001
om-0.001
Token watch
Token ID2342
Feature activation-1.72e+00

Pos logit contributions
ured+0.050
 therapist+0.048
ger+0.048
+0.048
edly+0.047
Neg logit contributions
run-0.034
arl-0.033
rou-0.029
om-0.027
o-0.027
Token TV
Token ID3195
Feature activation-2.95e+00

Pos logit contributions
ured+0.039
 therapist+0.038
ger+0.038
 exercises+0.038
seek+0.037
Neg logit contributions
arl-0.036
rou-0.035
run-0.034
om-0.030
ounces-0.029
Token for
Token ID329
Feature activation-1.49e+00

Pos logit contributions
ured+0.084
+0.082
 therapist+0.082
ger+0.081
cious+0.079
Neg logit contributions
arl-0.045
rou-0.042
run-0.041
om-0.036
ounces-0.033
Token 30
Token ID1542
Feature activation-6.61e+00

Pos logit contributions
 exercises+0.030
ured+0.029
 poses+0.029
 therapist+0.028
seek+0.027
Neg logit contributions
arl-0.038
rou-0.036
run-0.035
ounces-0.031
ucker-0.031
Token minutes
Token ID2431
Feature activation-1.00e+00

Pos logit contributions
ured+0.164
 therapist+0.158
ger+0.156
+0.150
seek+0.149
Neg logit contributions
arl-0.125
run-0.124
rou-0.119
om-0.115
etchup-0.105
Token.
Token ID13
Feature activation-3.16e-01

Pos logit contributions
 exercises+0.025
ured+0.024
 poses+0.024
 therapist+0.024
seek+0.023
Neg logit contributions
arl-0.020
rou-0.019
run-0.018
ounces-0.016
ucker-0.015
Token Lily
Token ID20037
Feature activation-1.41e+00

Pos logit contributions
+0.008
 exercises+0.008
 therapist+0.008
ured+0.008
 poses+0.008
Neg logit contributions
rou-0.006
arl-0.006
run-0.005
hin-0.004
ounces-0.004
Token asked
Token ID1965
Feature activation+4.18e-02

Pos logit contributions
ured+0.038
ger+0.037
 exercises+0.036
 poses+0.036
seek+0.036
Neg logit contributions
rou-0.023
o-0.021
om-0.020
run-0.019
ucker-0.019
Token Mitt
Token ID16627
Feature activation-5.87e-01

Pos logit contributions
arl+0.001
run+0.001
rou+0.001
 Johnny+0.001
om+0.001
Neg logit contributions
ured-0.001
 poses-0.001
 exercises-0.001
seek-0.001
 therapist-0.001
Token named
Token ID3706
Feature activation+3.24e+00

Pos logit contributions
 exercises+0.009
ured+0.009
ger+0.008
 therapist+0.008
+0.008
Neg logit contributions
arl-0.011
rou-0.010
run-0.009
ounces-0.008
ucker-0.008
Token Jack
Token ID3619
Feature activation+1.90e-01

Pos logit contributions
rou+0.071
arl+0.068
run+0.059
ucker+0.056
ounces+0.056
Neg logit contributions
 exercises-0.073
seek-0.071
nic-0.071
-0.071
 counted-0.070
Token.
Token ID13
Feature activation+2.80e-01

Pos logit contributions
arl+0.005
rou+0.004
run+0.004
ounces+0.004
ucker+0.004
Neg logit contributions
-0.004
 exercises-0.004
ger-0.003
seek-0.003
 poses-0.003
Token He
Token ID679
Feature activation-9.63e-01

Pos logit contributions
arl+0.005
rou+0.005
run+0.005
ounces+0.004
om+0.004
Neg logit contributions
 exercises-0.007
-0.007
 poses-0.007
seek-0.007
ger-0.007
Token was
Token ID373
Feature activation-1.32e+00

Pos logit contributions
ured+0.024
 poses+0.022
 exercises+0.022
 therapist+0.021
 sessions+0.020
Neg logit contributions
rou-0.020
o-0.019
arl-0.018
om-0.017
run-0.017
Token 7
Token ID767
Feature activation-6.50e+00

Pos logit contributions
 exercises+0.024
ured+0.022
 therapist+0.021
 poses+0.020
+0.020
Neg logit contributions
arl-0.038
rou-0.033
run-0.033
ucker-0.031
ounces-0.030
Token years
Token ID812
Feature activation+3.12e+00

Pos logit contributions
ured+0.147
 therapist+0.142
 exercises+0.137
 poses+0.136
+0.135
Neg logit contributions
rou-0.145
run-0.138
arl-0.136
om-0.124
ounces-0.118
Token old
Token ID1468
Feature activation+5.88e-02

Pos logit contributions
arl+0.084
rou+0.083
run+0.081
ounces+0.076
ucker+0.072
Neg logit contributions
 exercises-0.051
 therapist-0.050
ured-0.049
ger-0.048
 poses-0.048
Token and
Token ID290
Feature activation+2.29e-01

Pos logit contributions
arl+0.001
rou+0.001
run+0.001
ounces+0.001
om+0.001
Neg logit contributions
 exercises-0.002
ured-0.002
 poses-0.002
 therapist-0.001
-0.001
Token he
Token ID339
Feature activation-1.25e-01

Pos logit contributions
rou+0.005
run+0.005
arl+0.005
o+0.005
 Johnny+0.005
Neg logit contributions
ured-0.005
 poses-0.005
seek-0.005
 exercises-0.005
 therapist-0.005
Token wanted
Token ID2227
Feature activation+3.43e-01

Pos logit contributions
ured+0.003
 poses+0.003
 exercises+0.003
 therapist+0.003
 vows+0.003
Neg logit contributions
o-0.002
rou-0.002
run-0.002
om-0.002
 Tim-0.002
Token said
Token ID531
Feature activation+1.02e-01

Pos logit contributions
rou+0.002
run+0.002
om+0.001
arl+0.001
o+0.001
Neg logit contributions
 exercises-0.002
 poses-0.002
ured-0.002
seek-0.002
-0.002
Token 3
Token ID513
Feature activation-1.86e+00

Pos logit contributions
arl+0.002
rou+0.002
run+0.002
om+0.002
 Johnny+0.001
Neg logit contributions
 exercises-0.003
 poses-0.002
seek-0.002
-0.002
ured-0.002
Token plus
Token ID5556
Feature activation-1.64e+00

Pos logit contributions
ured+0.043
 exercises+0.042
 therapist+0.042
 poses+0.041
seek+0.041
Neg logit contributions
run-0.037
arl-0.037
rou-0.037
om-0.032
ounces-0.030
Token 4
Token ID604
Feature activation-4.20e+00

Pos logit contributions
ured+0.039
 exercises+0.039
seek+0.038
ger+0.038
 therapist+0.038
Neg logit contributions
run-0.033
rou-0.032
arl-0.032
om-0.028
ounces-0.026
Token makes
Token ID1838
Feature activation-1.44e+00

Pos logit contributions
ured+0.101
 therapist+0.100
seek+0.099
+0.097
 exercises+0.096
Neg logit contributions
run-0.084
rou-0.081
arl-0.078
om-0.071
ounces-0.064
Token 7
Token ID767
Feature activation-6.43e+00

Pos logit contributions
 exercises+0.036
ured+0.035
 poses+0.035
 therapist+0.035
+0.034
Neg logit contributions
arl-0.029
rou-0.027
run-0.026
ounces-0.023
om-0.022
Token.
Token ID13
Feature activation+2.40e-02

Pos logit contributions
ured+0.148
 therapist+0.146
+0.142
seek+0.141
ger+0.140
Neg logit contributions
rou-0.133
run-0.131
arl-0.127
om-0.113
ounces-0.108
Token The
Token ID383
Feature activation-1.17e-01

Pos logit contributions
rou+0.000
arl+0.000
run+0.000
ounces+0.000
ucker+0.000
Neg logit contributions
 therapist-0.001
 exercises-0.001
 poses-0.001
ured-0.001
-0.001
Token little
Token ID1310
Feature activation+1.99e+00

Pos logit contributions
ured+0.003
 exercises+0.003
+0.003
 poses+0.003
seek+0.003
Neg logit contributions
arl-0.002
rou-0.002
run-0.002
om-0.002
ounces-0.002
Token girl
Token ID2576
Feature activation+3.01e-01

Pos logit contributions
arl+0.037
rou+0.036
run+0.035
om+0.029
ounces+0.029
Neg logit contributions
ured-0.051
 exercises-0.050
-0.049
 poses-0.049
seek-0.048
Token was
Token ID373
Feature activation-1.78e-01

Pos logit contributions
rou+0.006
run+0.006
om+0.006
o+0.005
 Johnny+0.005
Neg logit contributions
 exercises-0.007
ured-0.007
ger-0.007
-0.007
seek-0.007
Token\n
Token ID198
Feature activation+1.03e-01

Pos logit contributions
 exercises+0.020
 poses+0.019
ured+0.019
 therapist+0.019
+0.019
Neg logit contributions
rou-0.013
arl-0.012
run-0.011
ounces-0.009
ucker-0.009
Token\n
Token ID198
Feature activation-6.50e-01

Pos logit contributions
rou+0.002
arl+0.002
run+0.002
 bruised+0.002
joy+0.002
Neg logit contributions
ves-0.002
Then-0.002
-0.002
 poses-0.002
 exercises-0.002
Token"
Token ID1
Feature activation-2.30e+00

Pos logit contributions
 exercises+0.018
ured+0.017
 poses+0.017
seek+0.016
est+0.016
Neg logit contributions
arl-0.012
 upon-0.009
om-0.009
 late-0.009
rou-0.009
TokenI
Token ID40
Feature activation+3.45e-02

Pos logit contributions
 exercises+0.064
ured+0.063
zed+0.061
ger+0.060
+0.060
Neg logit contributions
arl-0.039
rou-0.032
run-0.031
etchup-0.029
om-0.027
Token have
Token ID423
Feature activation-4.66e-01

Pos logit contributions
arl+0.001
rou+0.001
run+0.001
ounces+0.001
om+0.001
Neg logit contributions
 exercises-0.001
ured-0.001
 poses-0.001
 therapist-0.001
-0.001
Token six
Token ID2237
Feature activation-6.32e+00

Pos logit contributions
 exercises+0.011
ured+0.011
 poses+0.011
 therapist+0.011
+0.010
Neg logit contributions
arl-0.010
rou-0.009
run-0.009
ounces-0.008
ucker-0.008
Token blocks
Token ID7021
Feature activation-2.43e+00

Pos logit contributions
 therapist+0.177
ured+0.176
seek+0.167
 exercises+0.165
ger+0.164
Neg logit contributions
arl-0.103
run-0.100
rou-0.096
om-0.087
ounces-0.083
Token in
Token ID287
Feature activation-2.41e-01

Pos logit contributions
 exercises+0.059
ured+0.059
 poses+0.058
 therapist+0.057
seek+0.056
Neg logit contributions
arl-0.048
rou-0.046
run-0.044
ounces-0.038
ucker-0.037
Token my
Token ID616
Feature activation-2.22e-01

Pos logit contributions
 exercises+0.005
ured+0.005
 poses+0.005
+0.005
 therapist+0.005
Neg logit contributions
rou-0.005
arl-0.005
run-0.005
ucker-0.004
ounces-0.004
Token row
Token ID5752
Feature activation+4.71e-02

Pos logit contributions
 exercises+0.005
ured+0.005
 poses+0.005
 therapist+0.005
+0.005
Neg logit contributions
arl-0.004
rou-0.004
run-0.004
ounces-0.004
om-0.003
Token,"
Token ID553
Feature activation-1.96e-01

Pos logit contributions
arl+0.001
rou+0.001
run+0.001
ounces+0.001
ucker+0.001
Neg logit contributions
 exercises-0.001
 poses-0.001
ured-0.001
ger-0.001
 therapist-0.001
Token.
Token ID13
Feature activation-5.93e-01

Pos logit contributions
+0.053
ured+0.052
 vows+0.052
cious+0.051
nic+0.051
Neg logit contributions
arl-0.017
rou-0.016
run-0.016
 asleep-0.012
om-0.012
Token\n
Token ID198
Feature activation-3.33e-02

Pos logit contributions
 exercises+0.019
ured+0.019
 poses+0.018
 therapist+0.018
ger+0.018
Neg logit contributions
arl-0.008
rou-0.007
run-0.007
om-0.005
ounces-0.005
Token\n
Token ID198
Feature activation-6.60e-01

Pos logit contributions
ves+0.001
Her+0.001
Then+0.001
zed+0.001
seek+0.001
Neg logit contributions
arl-0.001
rou-0.001
ucker-0.001
run-0.001
etchup-0.001
Token"
Token ID1
Feature activation-2.62e+00

Pos logit contributions
ger+0.015
ured+0.015
 exercises+0.015
 therapist+0.015
 poses+0.014
Neg logit contributions
arl-0.013
run-0.013
rou-0.012
 upon-0.011
-0.011
TokenAll
Token ID3237
Feature activation-3.16e-01

Pos logit contributions
ured+0.063
 exercises+0.063
 poses+0.060
ger+0.058
+0.058
Neg logit contributions
arl-0.053
rou-0.049
run-0.049
o-0.043
ounces-0.042
Tokenerg
Token ID6422
Feature activation-6.29e+00

Pos logit contributions
 exercises+0.008
 poses+0.007
 therapist+0.007
ger+0.007
nic+0.007
Neg logit contributions
arl-0.007
rou-0.006
run-0.006
ounces-0.006
ucker-0.005
Tokenies
Token ID444
Feature activation-1.77e+00

Pos logit contributions
 therapist+0.179
ured+0.173
seek+0.173
ger+0.167
 poses+0.167
Neg logit contributions
rou-0.100
arl-0.099
run-0.089
om-0.078
ucker-0.077
Token!"
Token ID2474
Feature activation-1.71e+00

Pos logit contributions
 exercises+0.043
ured+0.042
 poses+0.042
 therapist+0.042
+0.040
Neg logit contributions
arl-0.036
rou-0.034
run-0.033
ounces-0.029
om-0.028
Token shouted
Token ID17293
Feature activation-2.35e+00

Pos logit contributions
 exercises+0.045
ured+0.044
ger+0.043
+0.043
 poses+0.042
Neg logit contributions
run-0.026
 Tim-0.024
arl-0.023
 Johnny-0.023
om-0.023
Token Anne
Token ID15397
Feature activation-1.73e+00

Pos logit contributions
 exercises+0.064
ured+0.062
 poses+0.062
+0.061
 therapist+0.061
Neg logit contributions
arl-0.036
run-0.036
rou-0.033
om-0.032
 Johnny-0.030
Token.
Token ID13
Feature activation-4.31e-01

Pos logit contributions
 exercises+0.043
ured+0.043
 poses+0.042
 therapist+0.042
+0.041
Neg logit contributions
arl-0.034
rou-0.032
run-0.031
ounces-0.027
om-0.026
Token's
Token ID338
Feature activation-3.21e+00

Pos logit contributions
edly+0.019
ured+0.019
seek+0.018
 poses+0.018
 exercises+0.018
Neg logit contributions
arl-0.017
rou-0.016
run-0.015
 Johnny-0.014
om-0.014
Token set
Token ID900
Feature activation-1.22e+00

Pos logit contributions
 exercises+0.059
ured+0.058
 poses+0.056
 therapist+0.055
+0.054
Neg logit contributions
arl-0.086
rou-0.083
run-0.081
ounces-0.072
om-0.072
Token the
Token ID262
Feature activation-9.58e-01

Pos logit contributions
 exercises+0.028
ured+0.028
 poses+0.026
ves+0.026
zed+0.025
Neg logit contributions
arl-0.028
rou-0.028
run-0.026
ucker-0.023
joy-0.023
Token timer
Token ID19781
Feature activation-3.65e+00

Pos logit contributions
 exercises+0.021
ured+0.021
 poses+0.020
 therapist+0.020
seek+0.020
Neg logit contributions
arl-0.022
rou-0.021
run-0.020
ounces-0.017
om-0.017
Token for
Token ID329
Feature activation-2.05e+00

Pos logit contributions
 exercises+0.097
ured+0.097
 poses+0.095
 therapist+0.095
+0.092
Neg logit contributions
arl-0.066
rou-0.062
run-0.059
ounces-0.050
om-0.049
Token 30
Token ID1542
Feature activation-6.27e+00

Pos logit contributions
ured+0.048
 exercises+0.048
 poses+0.045
 therapist+0.045
+0.045
Neg logit contributions
arl-0.045
rou-0.044
run-0.041
ucker-0.037
ounces-0.036
Token minutes
Token ID2431
Feature activation-6.82e-01

Pos logit contributions
ured+0.138
 exercises+0.136
 therapist+0.135
seek+0.133
+0.131
Neg logit contributions
arl-0.138
rou-0.131
run-0.130
om-0.117
ounces-0.114
Token,"
Token ID553
Feature activation+1.22e-01

Pos logit contributions
ured+0.015
 exercises+0.015
 poses+0.014
 therapist+0.014
seek+0.014
Neg logit contributions
arl-0.016
rou-0.015
run-0.014
ounces-0.012
ucker-0.012
Token dad
Token ID9955
Feature activation+1.30e+00

Pos logit contributions
arl+0.004
rou+0.003
run+0.003
ounces+0.003
om+0.003
Neg logit contributions
 therapist-0.002
 exercises-0.002
ured-0.002
 poses-0.002
ger-0.002
Token said
Token ID531
Feature activation+1.86e+00

Pos logit contributions
arl+0.028
run+0.027
rou+0.027
om+0.023
ounces+0.023
Neg logit contributions
 exercises-0.030
ured-0.029
 poses-0.029
 therapist-0.028
-0.028
Token.
Token ID13
Feature activation+4.86e-01

Pos logit contributions
arl+0.044
rou+0.042
run+0.036
ounces+0.036
ucker+0.034
Neg logit contributions
ured-0.042
 exercises-0.040
 poses-0.039
 therapist-0.039
seek-0.038
Token
Token ID220
Feature activation-1.37e+00

Pos logit contributions
 exercises+0.001
 poses+0.001
ured+0.001
+0.001
seek+0.001
Neg logit contributions
arl-0.001
rou-0.001
run-0.000
ounces-0.000
om-0.000
Token
Token ID198
Feature activation+7.00e-01

Pos logit contributions
+0.027
ves+0.027
Her+0.025
Then+0.024
seek+0.024
Neg logit contributions
rou-0.034
arl-0.032
joy-0.028
 bruised-0.027
ounces-0.027
Token
Token ID198
Feature activation+8.48e-02

Pos logit contributions
arl+0.013
rou+0.013
run+0.011
ucker+0.010
ounces+0.010
Neg logit contributions
-0.018
seek-0.018
 poses-0.018
ves-0.018
 exercises-0.018
TokenMax
Token ID11518
Feature activation-2.46e+00

Pos logit contributions
arl+0.002
rou+0.002
run+0.002
om+0.001
o+0.001
Neg logit contributions
ured-0.002
 exercises-0.002
 therapist-0.002
 poses-0.002
edly-0.002
Token and
Token ID290
Feature activation+4.69e-01

Pos logit contributions
ured+0.067
 exercises+0.066
 poses+0.065
 therapist+0.064
bees+0.061
Neg logit contributions
o-0.042
rou-0.040
om-0.038
run-0.038
 Johnny-0.036
Token Cole
Token ID11768
Feature activation-6.24e+00

Pos logit contributions
arl+0.010
rou+0.009
run+0.009
ounces+0.008
om+0.008
Neg logit contributions
ured-0.011
 exercises-0.010
 therapist-0.010
-0.010
 poses-0.010
Token loved
Token ID6151
Feature activation+2.16e+00

Pos logit contributions
 therapist+0.181
 poses+0.175
 exercises+0.172
ured+0.166
seek+0.158
Neg logit contributions
rou-0.091
om-0.086
run-0.085
o-0.085
 Tim-0.083
Token to
Token ID284
Feature activation-4.34e-01

Pos logit contributions
arl+0.046
rou+0.040
run+0.039
ucker+0.034
ounces+0.032
Neg logit contributions
 exercises-0.059
 poses-0.056
seek-0.052
-0.051
ured-0.051
Token play
Token ID711
Feature activation-1.44e+00

Pos logit contributions
 therapist+0.012
ured+0.012
edly+0.012
ger+0.011
 poses+0.011
Neg logit contributions
arl-0.007
rou-0.006
run-0.006
 Johnny-0.005
 Tim-0.005
Token outside
Token ID2354
Feature activation-1.01e+00

Pos logit contributions
 exercises+0.039
ured+0.039
 poses+0.038
 therapist+0.038
+0.037
Neg logit contributions
arl-0.025
rou-0.024
run-0.023
ounces-0.019
om-0.019
Token.
Token ID13
Feature activation-1.06e+00

Pos logit contributions
 exercises+0.024
 poses+0.023
ured+0.023
 therapist+0.023
ger+0.022
Neg logit contributions
arl-0.023
rou-0.021
run-0.020
ucker-0.018
ounces-0.017
Token clown
Token ID25573
Feature activation-2.45e+00

Pos logit contributions
ured+0.005
 exercises+0.005
+0.005
 therapist+0.005
 poses+0.005
Neg logit contributions
arl-0.003
rou-0.003
run-0.003
om-0.002
ounces-0.002
Tokens
Token ID82
Feature activation-6.54e-01

Pos logit contributions
 exercises+0.057
ured+0.056
 poses+0.055
 therapist+0.055
seek+0.054
Neg logit contributions
arl-0.054
rou-0.049
run-0.048
ounces-0.041
ucker-0.041
Token juggling
Token ID48097
Feature activation-2.38e+00

Pos logit contributions
 exercises+0.015
ured+0.015
 poses+0.015
 therapist+0.015
+0.014
Neg logit contributions
arl-0.014
rou-0.013
run-0.013
ounces-0.011
om-0.011
Token and
Token ID290
Feature activation+8.91e-01

Pos logit contributions
 exercises+0.058
ured+0.057
 therapist+0.056
 poses+0.056
+0.055
Neg logit contributions
arl-0.048
rou-0.046
run-0.044
ounces-0.038
om-0.038
Token ac
Token ID936
Feature activation+1.09e+00

Pos logit contributions
rou+0.016
arl+0.016
run+0.016
ounces+0.014
om+0.014
Neg logit contributions
ured-0.022
 exercises-0.022
 therapist-0.021
 poses-0.021
ger-0.021
Tokenrob
Token ID22609
Feature activation-6.10e+00

Pos logit contributions
arl+0.033
rou+0.032
run+0.031
ounces+0.028
om+0.028
Neg logit contributions
 exercises-0.016
ured-0.015
 poses-0.015
 therapist-0.015
seek-0.014
Tokenats
Token ID1381
Feature activation-1.33e+00

Pos logit contributions
ured+0.130
 therapist+0.130
 exercises+0.127
 poses+0.125
ger+0.124
Neg logit contributions
arl-0.141
rou-0.139
run-0.135
om-0.123
ounces-0.122
Token doing
Token ID1804
Feature activation-3.31e+00

Pos logit contributions
 exercises+0.035
ured+0.035
 therapist+0.034
 poses+0.034
+0.034
Neg logit contributions
arl-0.023
rou-0.022
run-0.022
om-0.018
ounces-0.018
Token flips
Token ID45971
Feature activation-1.57e+00

Pos logit contributions
 exercises+0.068
ured+0.068
 therapist+0.066
 poses+0.066
+0.064
Neg logit contributions
arl-0.079
rou-0.076
run-0.074
ounces-0.065
om-0.065
Token.
Token ID13
Feature activation-4.25e-01

Pos logit contributions
 exercises+0.040
ured+0.040
 therapist+0.039
 poses+0.039
+0.038
Neg logit contributions
arl-0.030
rou-0.028
run-0.027
om-0.023
ounces-0.023
Token But
Token ID887
Feature activation-1.95e+00

Pos logit contributions
 exercises+0.011
 therapist+0.011
ured+0.010
 poses+0.010
+0.010
Neg logit contributions
arl-0.009
rou-0.008
run-0.008
ounces-0.007
ucker-0.006
Token Lily
Token ID20037
Feature activation+2.15e-01

Pos logit contributions
 therapist+0.007
 exercises+0.007
ured+0.007
 poses+0.007
ger+0.007
Neg logit contributions
arl-0.004
rou-0.004
run-0.003
ounces-0.003
ucker-0.003
Token.
Token ID13
Feature activation-4.08e-01

Pos logit contributions
run+0.003
rou+0.003
arl+0.003
om+0.003
o+0.003
Neg logit contributions
 exercises-0.006
 poses-0.006
ured-0.006
nic-0.006
 therapist-0.006
Token "
Token ID366
Feature activation-1.64e+00

Pos logit contributions
+0.006
ured+0.006
 therapist+0.006
 poses+0.006
 exercises+0.005
Neg logit contributions
rou-0.012
arl-0.011
run-0.011
ounces-0.010
etchup-0.010
TokenI
Token ID40
Feature activation-4.38e-02

Pos logit contributions
 exercises+0.031
ured+0.031
 poses+0.030
 therapist+0.030
+0.029
Neg logit contributions
arl-0.042
rou-0.040
run-0.039
ounces-0.035
om-0.035
Token guess
Token ID4724
Feature activation-4.58e+00

Pos logit contributions
 exercises+0.001
ured+0.001
 poses+0.001
 therapist+0.001
+0.001
Neg logit contributions
arl-0.001
rou-0.001
run-0.001
ounces-0.001
om-0.001
Token 30
Token ID1542
Feature activation-6.07e+00

Pos logit contributions
ured+0.133
 therapist+0.131
 exercises+0.130
seek+0.129
+0.127
Neg logit contributions
run-0.067
arl-0.067
rou-0.065
ounces-0.053
om-0.053
Token."
Token ID526
Feature activation-5.03e-01

Pos logit contributions
ured+0.150
 therapist+0.150
 exercises+0.146
ger+0.143
+0.142
Neg logit contributions
arl-0.115
rou-0.114
run-0.113
om-0.100
ounces-0.096
Token
Token ID198
Feature activation-1.05e-01

Pos logit contributions
ured+0.013
 therapist+0.013
 poses+0.013
 exercises+0.013
edly+0.013
Neg logit contributions
rou-0.009
arl-0.009
run-0.008
ounces-0.007
hin-0.007
Token
Token ID198
Feature activation-6.45e-01

Pos logit contributions
ves+0.002
Her+0.002
Then+0.002
+0.002
 therapist+0.002
Neg logit contributions
rou-0.002
arl-0.002
run-0.002
 bruised-0.002
joy-0.002
TokenThey
Token ID2990
Feature activation-1.75e+00

Pos logit contributions
 exercises+0.014
ured+0.014
 poses+0.014
 therapist+0.013
ger+0.013
Neg logit contributions
arl-0.015
rou-0.014
run-0.013
om-0.012
ounces-0.012
Token asked
Token ID1965
Feature activation+3.04e-02

Pos logit contributions
 exercises+0.039
 poses+0.038
 therapist+0.038
ured+0.037
+0.036
Neg logit contributions
rou-0.037
arl-0.036
run-0.036
ounces-0.033
om-0.032
Token points
Token ID2173
Feature activation-1.25e+00

Pos logit contributions
ured+0.043
 therapist+0.042
 exercises+0.041
+0.041
seek+0.040
Neg logit contributions
arl-0.044
run-0.042
rou-0.042
om-0.037
ounces-0.036
Token.
Token ID13
Feature activation-5.34e-01

Pos logit contributions
 exercises+0.032
ured+0.031
 poses+0.031
 therapist+0.030
ger+0.029
Neg logit contributions
arl-0.025
rou-0.023
run-0.021
ounces-0.019
ucker-0.019
Token When
Token ID1649
Feature activation-4.85e-01

Pos logit contributions
 exercises+0.011
 poses+0.010
ured+0.010
+0.010
seek+0.010
Neg logit contributions
arl-0.013
rou-0.013
run-0.011
ounces-0.011
ucker-0.011
Token he
Token ID339
Feature activation-7.56e-01

Pos logit contributions
 exercises+0.009
ured+0.009
 therapist+0.009
 poses+0.009
ger+0.009
Neg logit contributions
arl-0.013
rou-0.012
run-0.012
ounces-0.010
ucker-0.010
Token reached
Token ID4251
Feature activation-1.23e+00

Pos logit contributions
 exercises+0.013
ured+0.013
 therapist+0.013
 poses+0.013
+0.012
Neg logit contributions
arl-0.021
rou-0.020
run-0.019
ounces-0.017
om-0.017
Token fifteen
Token ID17280
Feature activation-6.05e+00

Pos logit contributions
 exercises+0.032
 poses+0.031
ured+0.031
 therapist+0.030
ger+0.030
Neg logit contributions
arl-0.025
rou-0.023
run-0.021
ucker-0.018
ounces-0.018
Token points
Token ID2173
Feature activation-1.79e+00

Pos logit contributions
ured+0.139
 therapist+0.137
 exercises+0.136
seek+0.133
+0.131
Neg logit contributions
arl-0.128
rou-0.122
run-0.121
om-0.106
ounces-0.105
Token,
Token ID11
Feature activation-3.03e-01

Pos logit contributions
 exercises+0.042
ured+0.041
 poses+0.040
 therapist+0.040
+0.039
Neg logit contributions
arl-0.039
rou-0.036
run-0.035
ounces-0.031
om-0.030
Token he
Token ID339
Feature activation-2.94e-01

Pos logit contributions
 exercises+0.006
 therapist+0.006
 poses+0.006
ured+0.006
ger+0.005
Neg logit contributions
arl-0.008
rou-0.008
run-0.007
ounces-0.007
ucker-0.007
Token shouted
Token ID17293
Feature activation-1.54e+00

Pos logit contributions
 exercises+0.006
ured+0.006
 poses+0.006
 therapist+0.006
seek+0.006
Neg logit contributions
rou-0.007
run-0.006
arl-0.006
om-0.006
ounces-0.006
Token,
Token ID11
Feature activation+1.64e-01

Pos logit contributions
 exercises+0.043
 therapist+0.042
 poses+0.042
seek+0.041
nic+0.041
Neg logit contributions
arl-0.025
run-0.024
rou-0.024
om-0.020
joy-0.019
Token I
Token ID314
Feature activation+1.76e+00

Pos logit contributions
 exercises+0.039
ured+0.038
 poses+0.037
 therapist+0.037
ger+0.036
Neg logit contributions
arl-0.035
rou-0.032
run-0.030
ucker-0.027
ounces-0.027
Token can
Token ID460
Feature activation+1.71e-02

Pos logit contributions
arl+0.025
rou+0.023
run+0.022
om+0.018
ounces+0.017
Neg logit contributions
ured-0.054
 therapist-0.054
 exercises-0.053
 poses-0.053
seek-0.051
Token!
Token ID0
Feature activation+4.46e-01

Pos logit contributions
arl+0.000
rou+0.000
run+0.000
om+0.000
o+0.000
Neg logit contributions
 therapist-0.000
ured-0.000
 poses-0.000
ger-0.000
 exercises-0.000
Token There
Token ID1318
Feature activation+1.32e+00

Pos logit contributions
arl+0.010
rou+0.010
run+0.009
ounces+0.008
ucker+0.008
Neg logit contributions
 exercises-0.010
ured-0.010
 therapist-0.010
 poses-0.010
-0.009
Token are
Token ID389
Feature activation-2.47e+00

Pos logit contributions
arl+0.027
rou+0.026
run+0.025
ounces+0.021
om+0.021
Neg logit contributions
ured-0.031
 exercises-0.031
 therapist-0.031
 poses-0.031
-0.030
Token 8
Token ID807
Feature activation-5.99e+00

Pos logit contributions
ured+0.065
 therapist+0.065
 exercises+0.063
+0.063
 poses+0.062
Neg logit contributions
arl-0.043
rou-0.042
run-0.041
ounces-0.034
om-0.034
Token36
Token ID2623
Feature activation-3.71e+00

Pos logit contributions
ured+0.138
 therapist+0.136
seek+0.131
ger+0.129
+0.127
Neg logit contributions
rou-0.126
arl-0.125
run-0.124
om-0.109
ounces-0.107
Token leaves
Token ID5667
Feature activation-9.77e-01

Pos logit contributions
 therapist+0.087
ured+0.081
seek+0.075
ger+0.074
+0.074
Neg logit contributions
arl-0.080
run-0.075
rou-0.074
om-0.068
ounces-0.066
Token on
Token ID319
Feature activation-1.18e+00

Pos logit contributions
 exercises+0.025
ured+0.025
 poses+0.025
 therapist+0.024
+0.024
Neg logit contributions
arl-0.018
rou-0.017
run-0.017
ounces-0.014
om-0.014
Token this
Token ID428
Feature activation-6.81e-01

Pos logit contributions
 exercises+0.029
ured+0.029
 poses+0.029
 therapist+0.028
+0.028
Neg logit contributions
arl-0.024
rou-0.023
run-0.022
ucker-0.019
ounces-0.019
Token tree
Token ID5509
Feature activation+1.12e+00

Pos logit contributions
ured+0.016
 exercises+0.016
 therapist+0.016
seek+0.016
+0.015
Neg logit contributions
arl-0.014
rou-0.013
run-0.012
om-0.011
ounces-0.011
Token the
Token ID262
Feature activation-4.88e-02

Pos logit contributions
arl+0.011
rou+0.011
run+0.010
ounces+0.009
ucker+0.009
Neg logit contributions
 exercises-0.008
 particular-0.008
ured-0.008
 therapist-0.008
 poses-0.008
Token oven
Token ID14361
Feature activation+7.68e-02

Pos logit contributions
 exercises+0.001
ured+0.001
 poses+0.001
 therapist+0.001
+0.001
Neg logit contributions
arl-0.001
rou-0.001
run-0.001
ounces-0.001
om-0.001
Token.
Token ID13
Feature activation+2.51e-01

Pos logit contributions
arl+0.001
rou+0.001
run+0.001
ounces+0.001
om+0.001
Neg logit contributions
 exercises-0.002
ured-0.002
 poses-0.002
 therapist-0.002
-0.002
Token After
Token ID2293
Feature activation-2.13e+00

Pos logit contributions
rou+0.006
arl+0.006
run+0.005
ounces+0.005
hin+0.005
Neg logit contributions
ured-0.005
-0.005
 therapist-0.005
 exercises-0.005
-0.005
Token about
Token ID546
Feature activation+8.14e-01

Pos logit contributions
 exercises+0.049
ured+0.048
 poses+0.047
 therapist+0.047
+0.046
Neg logit contributions
arl-0.047
rou-0.044
run-0.043
ounces-0.037
om-0.036
Token 15
Token ID1315
Feature activation-5.96e+00

Pos logit contributions
arl+0.017
rou+0.015
run+0.015
ounces+0.013
om+0.012
Neg logit contributions
 exercises-0.020
ured-0.020
 poses-0.020
 therapist-0.019
-0.019
Token minutes
Token ID2431
Feature activation-1.56e+00

Pos logit contributions
seek+0.149
ured+0.147
 therapist+0.144
+0.141
ger+0.140
Neg logit contributions
run-0.103
arl-0.101
rou-0.100
om-0.099
etchup-0.098
Token,
Token ID11
Feature activation-3.73e-01

Pos logit contributions
 exercises+0.030
ured+0.030
 therapist+0.029
 poses+0.029
ger+0.028
Neg logit contributions
arl-0.040
rou-0.038
run-0.036
ounces-0.032
ucker-0.032
Token she
Token ID673
Feature activation+2.37e-01

Pos logit contributions
 exercises+0.010
ured+0.010
 poses+0.010
 therapist+0.009
+0.009
Neg logit contributions
arl-0.007
rou-0.007
run-0.006
ounces-0.005
om-0.005
Token took
Token ID1718
Feature activation+3.19e-01

Pos logit contributions
rou+0.005
run+0.005
arl+0.005
om+0.005
ucker+0.004
Neg logit contributions
 exercises-0.005
seek-0.005
ured-0.005
 poses-0.005
ger-0.005
Token the
Token ID262
Feature activation-9.69e-01

Pos logit contributions
arl+0.007
rou+0.006
run+0.006
ounces+0.005
ucker+0.005
Neg logit contributions
 exercises-0.008
ured-0.008
 therapist-0.008
 poses-0.007
ger-0.007
Token.
Token ID13
Feature activation-7.72e-01

Pos logit contributions
ured+0.110
+0.107
 therapist+0.107
 exercises+0.106
nic+0.105
Neg logit contributions
run-0.055
arl-0.050
rou-0.050
om-0.041
ounces-0.040
Token She
Token ID1375
Feature activation-1.03e+00

Pos logit contributions
ves+0.020
 exercises+0.020
ured+0.019
ger+0.019
 poses+0.019
Neg logit contributions
run-0.013
arl-0.013
 Tim-0.011
 Johnny-0.011
rou-0.011
Token tried
Token ID3088
Feature activation-9.74e-01

Pos logit contributions
 poses+0.027
 exercises+0.027
ured+0.027
+0.026
nic+0.026
Neg logit contributions
rou-0.016
run-0.016
o-0.015
 Johnny-0.015
 Tim-0.015
Token counting
Token ID14143
Feature activation-4.08e+00

Pos logit contributions
ured+0.028
 exercises+0.028
 therapist+0.027
 poses+0.027
+0.027
Neg logit contributions
arl-0.015
rou-0.015
run-0.014
ounces-0.012
om-0.011
Token to
Token ID284
Feature activation-3.00e-01

Pos logit contributions
ured+0.106
 therapist+0.106
 exercises+0.104
+0.102
nic+0.102
Neg logit contributions
arl-0.073
run-0.071
rou-0.069
om-0.058
ounces-0.058
Token 10
Token ID838
Feature activation-5.95e+00

Pos logit contributions
 therapist+0.006
ured+0.006
 exercises+0.006
+0.006
ger+0.006
Neg logit contributions
arl-0.007
run-0.007
rou-0.006
ounces-0.006
om-0.006
Token but
Token ID475
Feature activation-2.77e-01

Pos logit contributions
ured+0.155
 therapist+0.154
 exercises+0.148
+0.147
seek+0.147
Neg logit contributions
run-0.107
arl-0.105
rou-0.098
om-0.090
ounces-0.087
Token then
Token ID788
Feature activation+2.15e-01

Pos logit contributions
nic+0.007
ured+0.007
+0.007
ger+0.007
seek+0.007
Neg logit contributions
run-0.005
 upon-0.005
rou-0.005
 Tim-0.005
 Johnny-0.004
Token she
Token ID673
Feature activation+3.82e-01

Pos logit contributions
 Johnny+0.004
run+0.004
rou+0.004
 Tim+0.004
 upon+0.004
Neg logit contributions
ured-0.005
-0.005
nic-0.005
 poses-0.004
seek-0.004
Token got
Token ID1392
Feature activation+6.53e-01

Pos logit contributions
rou+0.007
run+0.007
arl+0.007
om+0.006
ounces+0.006
Neg logit contributions
 exercises-0.010
ured-0.010
 poses-0.010
 therapist-0.009
nic-0.009
Token stuck
Token ID7819
Feature activation+9.88e-01

Pos logit contributions
arl+0.014
rou+0.012
run+0.012
ounces+0.010
om+0.010
Neg logit contributions
 exercises-0.016
ured-0.016
 poses-0.016
 therapist-0.016
-0.015
Token the
Token ID262
Feature activation-1.21e+00

Pos logit contributions
ured+0.042
 therapist+0.041
 exercises+0.040
 poses+0.040
seek+0.040
Neg logit contributions
arl-0.019
rou-0.017
run-0.017
om-0.014
ounces-0.013
Token floor
Token ID4314
Feature activation-6.89e-01

Pos logit contributions
ured+0.024
 therapist+0.024
+0.023
edly+0.023
 exercises+0.023
Neg logit contributions
rou-0.027
arl-0.027
run-0.024
om-0.024
 Johnny-0.023
Token.
Token ID13
Feature activation-3.29e-01

Pos logit contributions
ured+0.019
 therapist+0.019
 exercises+0.019
+0.019
nic+0.019
Neg logit contributions
rou-0.011
arl-0.011
run-0.010
om-0.009
ounces-0.008
Token She
Token ID1375
Feature activation-6.77e-01

Pos logit contributions
 exercises+0.008
 therapist+0.008
ured+0.007
 poses+0.007
ger+0.007
Neg logit contributions
arl-0.007
rou-0.007
run-0.007
om-0.006
ounces-0.006
Token had
Token ID550
Feature activation+2.36e-01

Pos logit contributions
ured+0.016
 exercises+0.015
 therapist+0.015
 poses+0.015
seek+0.015
Neg logit contributions
rou-0.014
run-0.014
arl-0.013
om-0.012
ounces-0.012
Token 4
Token ID604
Feature activation-5.94e+00

Pos logit contributions
arl+0.005
rou+0.004
run+0.004
ounces+0.004
om+0.004
Neg logit contributions
 exercises-0.006
ured-0.006
 poses-0.006
 therapist-0.006
-0.006
Token of
Token ID286
Feature activation+2.40e-01

Pos logit contributions
 therapist+0.140
ured+0.139
 exercises+0.128
seek+0.128
+0.128
Neg logit contributions
run-0.126
arl-0.117
rou-0.117
om-0.112
ounces-0.100
Token them
Token ID606
Feature activation-2.21e+00

Pos logit contributions
arl+0.006
rou+0.006
run+0.005
ucker+0.005
ounces+0.005
Neg logit contributions
 exercises-0.005
 poses-0.005
ured-0.005
-0.005
 therapist-0.005
Token,
Token ID11
Feature activation+6.14e-01

Pos logit contributions
 therapist+0.057
ured+0.057
 exercises+0.056
 poses+0.055
+0.055
Neg logit contributions
arl-0.040
rou-0.039
run-0.039
om-0.033
ounces-0.032
Token and
Token ID290
Feature activation-1.51e-01

Pos logit contributions
arl+0.012
rou+0.012
run+0.011
om+0.010
ounces+0.009
Neg logit contributions
ured-0.015
 exercises-0.015
 therapist-0.015
 poses-0.015
seek-0.015
Token felt
Token ID2936
Feature activation-6.36e-01

Pos logit contributions
ured+0.003
ger+0.003
seek+0.003
 therapist+0.003
 exercises+0.003
Neg logit contributions
rou-0.004
run-0.003
 Johnny-0.003
arl-0.003
 Tim-0.003
Token "
Token ID366
Feature activation-2.92e+00

Pos logit contributions
 exercises+0.023
ured+0.023
+0.023
 poses+0.023
 therapist+0.023
Neg logit contributions
rou-0.014
arl-0.014
run-0.012
ounces-0.010
ucker-0.009
TokenI
Token ID40
Feature activation-3.96e-01

Pos logit contributions
 exercises+0.069
ured+0.069
 therapist+0.067
 poses+0.066
ger+0.065
Neg logit contributions
arl-0.060
rou-0.056
run-0.055
om-0.047
o-0.047
Token got
Token ID1392
Feature activation-2.09e-01

Pos logit contributions
ured+0.011
 therapist+0.011
 exercises+0.010
 poses+0.010
edly+0.010
Neg logit contributions
arl-0.007
rou-0.007
run-0.007
om-0.006
ounces-0.005
Token to
Token ID284
Feature activation-8.05e-01

Pos logit contributions
 exercises+0.005
 poses+0.005
ured+0.005
 therapist+0.004
seek+0.004
Neg logit contributions
arl-0.005
rou-0.004
run-0.004
ucker-0.004
ounces-0.003
Token number
Token ID1271
Feature activation-4.15e+00

Pos logit contributions
ured+0.017
 therapist+0.017
 exercises+0.017
 poses+0.017
+0.016
Neg logit contributions
arl-0.018
rou-0.018
run-0.017
ounces-0.015
om-0.015
Token six
Token ID2237
Feature activation-5.93e+00

Pos logit contributions
 exercises+0.101
ured+0.101
 therapist+0.099
 poses+0.099
+0.096
Neg logit contributions
arl-0.083
rou-0.079
run-0.076
ounces-0.066
om-0.065
Token!"
Token ID2474
Feature activation-1.53e+00

Pos logit contributions
ured+0.146
 therapist+0.146
 exercises+0.141
seek+0.137
 poses+0.137
Neg logit contributions
arl-0.116
run-0.111
rou-0.110
om-0.098
ounces-0.096
Token Emma
Token ID18966
Feature activation-1.04e+00

Pos logit contributions
 exercises+0.043
ured+0.043
 poses+0.042
 therapist+0.041
seek+0.041
Neg logit contributions
arl-0.024
run-0.023
rou-0.022
om-0.018
ounces-0.018
Token said
Token ID531
Feature activation+1.14e-01

Pos logit contributions
 exercises+0.031
ured+0.030
 poses+0.030
zed+0.029
+0.029
Neg logit contributions
o-0.013
run-0.013
rou-0.013
om-0.013
 Johnny-0.012
Token.
Token ID13
Feature activation-1.90e-01

Pos logit contributions
arl+0.002
run+0.002
rou+0.002
om+0.001
ounces+0.001
Neg logit contributions
 exercises-0.003
 poses-0.003
ured-0.003
 therapist-0.003
-0.003
Token "
Token ID366
Feature activation-2.88e+00

Pos logit contributions
 exercises+0.005
 therapist+0.005
ured+0.005
+0.005
 poses+0.005
Neg logit contributions
rou-0.004
arl-0.004
run-0.003
ounces-0.003
om-0.003
Token the
Token ID262
Feature activation-1.10e-01

Pos logit contributions
arl+0.000
rou+0.000
run+0.000
om+0.000
ounces+0.000
Neg logit contributions
ured-0.001
 exercises-0.001
 poses-0.001
 therapist-0.001
seek-0.001
Token ground
Token ID2323
Feature activation+1.06e+00

Pos logit contributions
ured+0.002
 exercises+0.002
+0.002
 therapist+0.002
nic+0.002
Neg logit contributions
arl-0.002
rou-0.002
run-0.002
om-0.002
 Johnny-0.002
Token.
Token ID13
Feature activation-3.50e-01

Pos logit contributions
arl+0.017
rou+0.016
run+0.015
om+0.013
ounces+0.012
Neg logit contributions
 exercises-0.030
ured-0.030
 therapist-0.030
 poses-0.029
-0.029
Token It
Token ID632
Feature activation-1.39e+00

Pos logit contributions
 exercises+0.006
 therapist+0.006
+0.006
 poses+0.006
ured+0.006
Neg logit contributions
rou-0.010
arl-0.010
run-0.009
ounces-0.008
ucker-0.008
Token had
Token ID550
Feature activation+7.69e-01

Pos logit contributions
ured+0.046
 exercises+0.043
seek+0.043
 poses+0.043
nic+0.042
Neg logit contributions
 Tim-0.020
o-0.019
 Johnny-0.017
run-0.016
-0.015
Token six
Token ID2237
Feature activation-5.92e+00

Pos logit contributions
arl+0.013
rou+0.013
run+0.012
om+0.010
ounces+0.010
Neg logit contributions
 exercises-0.021
ured-0.021
 therapist-0.020
 poses-0.020
-0.020
Token legs
Token ID7405
Feature activation-8.34e-01

Pos logit contributions
 therapist+0.161
ured+0.154
seek+0.152
nic+0.151
ger+0.144
Neg logit contributions
run-0.108
om-0.107
ounces-0.093
arl-0.090
rou-0.089
Token and
Token ID290
Feature activation+5.51e-01

Pos logit contributions
 exercises+0.021
ured+0.020
 poses+0.020
 therapist+0.020
+0.019
Neg logit contributions
arl-0.017
rou-0.016
run-0.015
ounces-0.013
om-0.013
Token two
Token ID734
Feature activation-2.25e+00

Pos logit contributions
rou+0.007
run+0.007
arl+0.006
om+0.005
 Tim+0.005
Neg logit contributions
ured-0.017
-0.017
 exercises-0.017
 therapist-0.017
ger-0.017
Token feel
Token ID1254
Feature activation-8.55e-02

Pos logit contributions
 therapist+0.060
ured+0.060
 exercises+0.059
seek+0.059
nic+0.058
Neg logit contributions
arl-0.039
run-0.038
rou-0.038
om-0.033
ounces-0.031
Tokeners
Token ID364
Feature activation-2.50e-01

Pos logit contributions
seek+0.002
 exercises+0.002
 therapist+0.002
 poses+0.002
nic+0.002
Neg logit contributions
run-0.001
rou-0.001
arl-0.001
om-0.001
ounces-0.001
Token a
Token ID257
Feature activation+5.72e-01

Pos logit contributions
 exercises+0.029
 poses+0.028
ured+0.028
 therapist+0.027
+0.026
Neg logit contributions
arl-0.030
rou-0.028
run-0.027
ounces-0.023
joy-0.022
Token mess
Token ID2085
Feature activation+2.21e-01

Pos logit contributions
arl+0.013
run+0.013
rou+0.012
 Johnny+0.010
o+0.010
Neg logit contributions
 poses-0.013
 exercises-0.013
 therapist-0.012
ger-0.012
ured-0.012
Token.
Token ID13
Feature activation-2.86e-03

Pos logit contributions
arl+0.004
rou+0.004
run+0.004
om+0.003
ounces+0.003
Neg logit contributions
 exercises-0.006
ured-0.006
 therapist-0.006
 poses-0.006
-0.006
Token Lily
Token ID20037
Feature activation-6.62e-01

Pos logit contributions
+0.000
Her+0.000
ured+0.000
 exercises+0.000
Then+0.000
Neg logit contributions
rou-0.000
arl-0.000
hin-0.000
run-0.000
ounces-0.000
Token put
Token ID1234
Feature activation+6.73e-01

Pos logit contributions
 poses+0.016
 exercises+0.016
ured+0.016
nic+0.015
seek+0.015
Neg logit contributions
rou-0.012
run-0.011
o-0.011
om-0.011
ucker-0.010
Token six
Token ID2237
Feature activation-5.92e+00

Pos logit contributions
arl+0.014
rou+0.014
run+0.013
ounces+0.011
ucker+0.011
Neg logit contributions
 exercises-0.016
 therapist-0.016
ured-0.015
 poses-0.015
ger-0.015
Token jelly
Token ID27644
Feature activation-2.24e+00

Pos logit contributions
seek+0.147
ured+0.146
+0.140
 therapist+0.140
 exercises+0.139
Neg logit contributions
arl-0.111
run-0.110
rou-0.106
om-0.098
etchup-0.095
Token cups
Token ID14180
Feature activation-7.67e-01

Pos logit contributions
 exercises+0.062
+0.062
ured+0.061
seek+0.061
 poses+0.060
Neg logit contributions
arl-0.038
rou-0.034
run-0.033
om-0.028
etchup-0.027
Token in
Token ID287
Feature activation-5.52e-01

Pos logit contributions
ured+0.019
 exercises+0.018
 therapist+0.018
 poses+0.018
ger+0.018
Neg logit contributions
arl-0.015
rou-0.015
run-0.014
ounces-0.012
ucker-0.012
Token a
Token ID257
Feature activation+9.29e-01

Pos logit contributions
ured+0.011
 exercises+0.011
 therapist+0.011
 poses+0.011
 particular+0.010
Neg logit contributions
arl-0.014
rou-0.014
run-0.013
ounces-0.011
ucker-0.011
Token basket
Token ID7988
Feature activation+2.75e+00

Pos logit contributions
arl+0.018
rou+0.017
run+0.016
ounces+0.014
om+0.014
Neg logit contributions
 exercises-0.024
ured-0.023
 therapist-0.023
 poses-0.023
-0.022
Token,
Token ID11
Feature activation-6.82e-01

Pos logit contributions
 therapist+0.121
ured+0.119
seek+0.115
 exercises+0.115
+0.113
Neg logit contributions
run-0.073
arl-0.070
rou-0.069
om-0.061
ounces-0.059
Token six
Token ID2237
Feature activation-4.96e+00

Pos logit contributions
 therapist+0.016
 exercises+0.016
ured+0.016
 poses+0.015
seek+0.015
Neg logit contributions
arl-0.014
rou-0.013
run-0.013
om-0.012
ounces-0.011
Token.
Token ID13
Feature activation-8.24e-01

Pos logit contributions
 therapist+0.138
ured+0.136
 exercises+0.131
seek+0.131
+0.130
Neg logit contributions
run-0.081
arl-0.080
rou-0.077
om-0.068
ounces-0.066
Token She
Token ID1375
Feature activation-2.35e-01

Pos logit contributions
 exercises+0.019
 poses+0.019
ured+0.019
 therapist+0.018
+0.018
Neg logit contributions
arl-0.018
rou-0.018
run-0.016
ucker-0.014
ounces-0.014
Token has
Token ID468
Feature activation-2.11e-01

Pos logit contributions
ured+0.006
 therapist+0.005
edly+0.005
ger+0.005
 exercises+0.005
Neg logit contributions
rou-0.005
run-0.005
ounces-0.004
arl-0.004
om-0.004
Token six
Token ID2237
Feature activation-5.91e+00

Pos logit contributions
 therapist+0.006
ured+0.006
 exercises+0.005
ger+0.005
+0.005
Neg logit contributions
arl-0.004
rou-0.004
run-0.004
ounces-0.003
om-0.003
Token past
Token ID1613
Feature activation-2.78e+00

Pos logit contributions
 therapist+0.156
ured+0.146
seek+0.142
nic+0.139
ger+0.139
Neg logit contributions
run-0.117
arl-0.110
om-0.105
rou-0.102
ounces-0.099
Tokenels
Token ID1424
Feature activation-1.11e+00

Pos logit contributions
 therapist+0.062
ured+0.062
 exercises+0.062
 poses+0.060
seek+0.059
Neg logit contributions
arl-0.061
run-0.058
rou-0.057
om-0.051
ounces-0.050
Token.
Token ID13
Feature activation-3.23e-01

Pos logit contributions
 exercises+0.031
ured+0.030
 therapist+0.030
 poses+0.030
+0.030
Neg logit contributions
arl-0.019
rou-0.018
run-0.017
om-0.015
ounces-0.014
Token Four
Token ID6675
Feature activation-3.95e+00

Pos logit contributions
 exercises+0.007
 poses+0.007
ured+0.007
 therapist+0.007
+0.007
Neg logit contributions
arl-0.007
rou-0.007
run-0.007
ounces-0.006
ucker-0.006
Token are
Token ID389
Feature activation-1.02e+00

Pos logit contributions
 therapist+0.100
ured+0.097
 exercises+0.096
seek+0.096
ger+0.093
Neg logit contributions
arl-0.076
run-0.074
rou-0.071
om-0.067
ounces-0.064
Token two
Token ID734
Feature activation-2.23e+00

Pos logit contributions
 exercises+0.006
nic+0.006
ured+0.006
 therapist+0.005
cious+0.005
Neg logit contributions
arl-0.009
run-0.009
rou-0.009
ounces-0.008
ucker-0.008
Token brothers
Token ID9397
Feature activation-2.19e-01

Pos logit contributions
 exercises+0.055
 poses+0.052
ured+0.052
 therapist+0.051
bees+0.050
Neg logit contributions
arl-0.047
run-0.042
rou-0.042
ounces-0.036
ucker-0.034
Token named
Token ID3706
Feature activation+2.22e+00

Pos logit contributions
+0.004
 exercises+0.004
ured+0.004
ger+0.004
seek+0.003
Neg logit contributions
arl-0.006
rou-0.005
run-0.005
ucker-0.005
ounces-0.005
Token Max
Token ID5436
Feature activation-9.72e-01

Pos logit contributions
arl+0.042
rou+0.039
run+0.034
ucker+0.032
ounces+0.030
Neg logit contributions
 exercises-0.061
ured-0.057
 poses-0.056
nic-0.056
-0.056
Token and
Token ID290
Feature activation+4.89e-01

Pos logit contributions
+0.015
ger+0.015
ured+0.015
 exercises+0.015
seek+0.014
Neg logit contributions
arl-0.029
rou-0.025
run-0.024
ounces-0.023
ucker-0.023
Token Cole
Token ID11768
Feature activation-5.89e+00

Pos logit contributions
arl+0.007
rou+0.007
run+0.006
ounces+0.005
om+0.005
Neg logit contributions
 exercises-0.015
ured-0.014
 poses-0.014
 therapist-0.014
-0.014
Token.
Token ID13
Feature activation-3.09e-02

Pos logit contributions
 exercises+0.150
ured+0.149
 poses+0.146
 therapist+0.145
+0.143
Neg logit contributions
arl-0.115
rou-0.107
run-0.102
ounces-0.088
ucker-0.086
Token
Token ID220
Feature activation-1.37e+00

Pos logit contributions
 exercises+0.001
 poses+0.001
ured+0.001
+0.001
seek+0.001
Neg logit contributions
arl-0.001
rou-0.001
run-0.000
ounces-0.000
om-0.000
Token
Token ID198
Feature activation+7.00e-01

Pos logit contributions
+0.027
ves+0.027
Her+0.025
Then+0.024
seek+0.024
Neg logit contributions
rou-0.034
arl-0.032
joy-0.028
 bruised-0.027
ounces-0.027
Token
Token ID198
Feature activation+8.48e-02

Pos logit contributions
arl+0.013
rou+0.013
run+0.011
ucker+0.010
ounces+0.010
Neg logit contributions
-0.018
seek-0.018
 poses-0.018
ves-0.018
 exercises-0.018
TokenMax
Token ID11518
Feature activation-2.46e+00

Pos logit contributions
arl+0.002
rou+0.002
run+0.002
om+0.001
o+0.001
Neg logit contributions
ured-0.002
 exercises-0.002
 therapist-0.002
 poses-0.002
edly-0.002
Token He
Token ID679
Feature activation-6.16e-01

Pos logit contributions
arl+0.008
rou+0.008
run+0.008
ounces+0.007
ucker+0.007
Neg logit contributions
 exercises-0.006
-0.006
ger-0.006
 poses-0.005
 therapist-0.005
Token was
Token ID373
Feature activation-8.42e-01

Pos logit contributions
ured+0.016
 poses+0.014
 exercises+0.014
seek+0.014
nic+0.014
Neg logit contributions
rou-0.013
o-0.011
om-0.010
 upon-0.010
 Johnny-0.010
Token grateful
Token ID14066
Feature activation+2.52e+00

Pos logit contributions
 exercises+0.018
ured+0.018
 therapist+0.018
 poses+0.018
+0.017
Neg logit contributions
arl-0.019
rou-0.018
run-0.018
ounces-0.016
om-0.015
Token for
Token ID329
Feature activation+1.76e-01

Pos logit contributions
arl+0.050
rou+0.046
run+0.044
ounces+0.040
om+0.038
Neg logit contributions
 exercises-0.064
 poses-0.062
ured-0.062
 therapist-0.062
ger-0.061
Token everything
Token ID2279
Feature activation-1.70e-01

Pos logit contributions
arl+0.004
rou+0.004
run+0.004
ucker+0.004
om+0.004
Neg logit contributions
 exercises-0.004
ured-0.004
 poses-0.004
 therapist-0.003
-0.003
Token he
Token ID339
Feature activation+3.50e+00

Pos logit contributions
ured+0.004
 exercises+0.004
 poses+0.004
+0.004
 therapist+0.004
Neg logit contributions
rou-0.003
arl-0.003
run-0.003
 Johnny-0.003
o-0.003
Token had
Token ID550
Feature activation+3.52e+00

Pos logit contributions
rou+0.071
arl+0.069
run+0.068
o+0.060
 upon+0.058
Neg logit contributions
 exercises-0.084
ured-0.084
 poses-0.082
 therapist-0.079
-0.076
Token and
Token ID290
Feature activation+2.19e-02

Pos logit contributions
arl+0.068
rou+0.066
run+0.063
om+0.054
ounces+0.054
Neg logit contributions
 exercises-0.087
ured-0.086
 therapist-0.086
 poses-0.085
-0.082
Token everyone
Token ID2506
Feature activation-4.50e-01

Pos logit contributions
rou+0.000
arl+0.000
run+0.000
ounces+0.000
 Johnny+0.000
Neg logit contributions
ured-0.000
 exercises-0.000
 poses-0.000
-0.000
 therapist-0.000
Token he
Token ID339
Feature activation+2.64e+00

Pos logit contributions
ured+0.011
 exercises+0.010
 poses+0.010
 therapist+0.010
seek+0.010
Neg logit contributions
rou-0.010
run-0.009
arl-0.009
o-0.008
om-0.008
Token knew
Token ID2993
Feature activation+2.91e+00

Pos logit contributions
rou+0.060
run+0.055
arl+0.054
o+0.052
ounces+0.048
Neg logit contributions
ured-0.060
 exercises-0.058
 poses-0.057
 therapist-0.054
-0.053
Token be
Token ID307
Feature activation-3.13e-02

Pos logit contributions
arl+0.023
rou+0.022
run+0.021
ounces+0.018
ucker+0.018
Neg logit contributions
 therapist-0.022
ured-0.022
 poses-0.022
 exercises-0.022
ger-0.021
Token good
Token ID922
Feature activation+6.88e-01

Pos logit contributions
 exercises+0.001
ured+0.001
 poses+0.001
 therapist+0.001
+0.001
Neg logit contributions
arl-0.001
rou-0.001
run-0.001
ounces-0.001
om-0.001
Token friends
Token ID2460
Feature activation+2.01e+00

Pos logit contributions
arl+0.015
rou+0.013
run+0.013
ounces+0.012
om+0.011
Neg logit contributions
 exercises-0.017
 poses-0.016
ured-0.016
 therapist-0.016
ger-0.016
Token again
Token ID757
Feature activation-4.75e-01

Pos logit contributions
rou+0.028
arl+0.027
run+0.027
om+0.020
o+0.020
Neg logit contributions
ured-0.059
 exercises-0.059
 poses-0.058
 therapist-0.057
-0.056
Token!
Token ID0
Feature activation+7.69e-01

Pos logit contributions
 exercises+0.012
ured+0.012
 therapist+0.012
 poses+0.012
+0.012
Neg logit contributions
arl-0.009
rou-0.008
run-0.008
ounces-0.007
om-0.007
Token<|endoftext|>
Token ID50256
Feature activation+3.61e+00

Pos logit contributions
arl+0.018
rou+0.016
run+0.014
ounces+0.014
ucker+0.013
Neg logit contributions
 exercises-0.017
-0.017
-0.017
 therapist-0.016
zed-0.016
TokenOnce
Token ID7454
Feature activation+7.88e-01

Pos logit contributions
rou+0.095
arl+0.094
run+0.085
ucker+0.081
ounces+0.076
Neg logit contributions
ger-0.074
 therapist-0.069
gers-0.066
-0.065
 exercises-0.065
Token there
Token ID612
Feature activation+6.55e-01

Pos logit contributions
arl+0.014
rou+0.013
hin+0.012
run+0.011
ounces+0.011
Neg logit contributions
 particular-0.021
ves-0.019
 exercises-0.018
-0.018
ger-0.018
Token was
Token ID373
Feature activation-5.15e-01

Pos logit contributions
arl+0.017
rou+0.017
run+0.016
ounces+0.015
om+0.015
Neg logit contributions
 exercises-0.012
ured-0.012
 poses-0.011
 therapist-0.011
-0.011
Token a
Token ID257
Feature activation+1.33e+00

Pos logit contributions
 exercises+0.012
ured+0.012
 poses+0.012
 therapist+0.012
+0.011
Neg logit contributions
arl-0.011
rou-0.010
run-0.010
ounces-0.009
ucker-0.009
Token girl
Token ID2576
Feature activation+8.92e-01

Pos logit contributions
run+0.032
arl+0.031
rou+0.028
ounces+0.026
ucker+0.025
Neg logit contributions
 therapist-0.028
ger-0.027
 exercises-0.026
-0.026
nic-0.026
Token and
Token ID290
Feature activation+8.00e-02

Pos logit contributions
arl+0.042
rou+0.039
run+0.038
ounces+0.036
ucker+0.034
Neg logit contributions
 exercises-0.036
ured-0.036
 poses-0.034
-0.033
 sessions-0.032
Token went
Token ID1816
Feature activation+2.17e+00

Pos logit contributions
arl+0.002
rou+0.002
run+0.002
ounces+0.002
om+0.002
Neg logit contributions
 exercises-0.002
ured-0.002
 poses-0.002
 therapist-0.002
-0.001
Token home
Token ID1363
Feature activation-2.85e-01

Pos logit contributions
arl+0.038
rou+0.037
run+0.035
ucker+0.029
ounces+0.029
Neg logit contributions
ured-0.059
 exercises-0.059
 poses-0.057
 therapist-0.057
seek-0.057
Token happy
Token ID3772
Feature activation+5.45e-01

Pos logit contributions
ured+0.006
seek+0.006
 exercises+0.006
 poses+0.006
 therapist+0.006
Neg logit contributions
arl-0.007
rou-0.006
run-0.006
ucker-0.005
ounces-0.005
Token.
Token ID13
Feature activation+5.15e-01

Pos logit contributions
arl+0.010
rou+0.009
run+0.009
ounces+0.008
om+0.008
Neg logit contributions
 exercises-0.014
ured-0.014
 poses-0.014
 therapist-0.014
seek-0.014
Token<|endoftext|>
Token ID50256
Feature activation+3.41e+00

Pos logit contributions
arl+0.012
rou+0.011
run+0.011
ounces+0.010
ucker+0.009
Neg logit contributions
-0.012
zed-0.011
 exercises-0.011
ured-0.011
seek-0.011
TokenOnce
Token ID7454
Feature activation+6.72e-01

Pos logit contributions
rou+0.096
arl+0.093
run+0.083
ucker+0.080
 bruised+0.075
Neg logit contributions
ger-0.066
seek-0.062
 therapist-0.060
-0.058
gers-0.057
Token upon
Token ID2402
Feature activation+9.81e-01

Pos logit contributions
arl+0.012
rou+0.012
hin+0.010
run+0.010
ounces+0.009
Neg logit contributions
 particular-0.018
ves-0.016
 exercises-0.015
-0.015
ger-0.015
Token a
Token ID257
Feature activation+4.44e-01

Pos logit contributions
arl+0.028
rou+0.026
run+0.025
ucker+0.024
ounces+0.024
Neg logit contributions
 exercises-0.018
 poses-0.017
seek-0.017
zed-0.016
 therapist-0.016
Token time
Token ID640
Feature activation+1.28e-02

Pos logit contributions
arl+0.011
rou+0.011
run+0.010
ounces+0.009
ucker+0.009
Neg logit contributions
 exercises-0.009
 poses-0.009
-0.008
zed-0.008
nic-0.008
Token,
Token ID11
Feature activation+4.62e-01

Pos logit contributions
arl+0.000
rou+0.000
run+0.000
ounces+0.000
ucker+0.000
Neg logit contributions
 exercises-0.000
ured-0.000
 poses-0.000
-0.000
ger-0.000
Token Lily
Token ID20037
Feature activation+2.15e+00

Pos logit contributions
arl+0.041
rou+0.041
run+0.039
om+0.033
ounces+0.033
Neg logit contributions
ured-0.053
 exercises-0.052
 poses-0.051
-0.051
 therapist-0.050
Token's
Token ID338
Feature activation+6.01e-01

Pos logit contributions
rou+0.048
run+0.046
arl+0.043
o+0.042
om+0.041
Neg logit contributions
ured-0.049
 exercises-0.047
 poses-0.046
-0.045
seek-0.044
Token tooth
Token ID16162
Feature activation+8.06e-01

Pos logit contributions
arl+0.014
rou+0.013
run+0.013
ounces+0.011
om+0.011
Neg logit contributions
 exercises-0.013
ured-0.013
 poses-0.013
 therapist-0.012
-0.012
Token was
Token ID373
Feature activation+1.22e+00

Pos logit contributions
rou+0.018
run+0.018
arl+0.018
om+0.015
o+0.015
Neg logit contributions
ured-0.018
 exercises-0.017
-0.017
 poses-0.017
nic-0.017
Token too
Token ID1165
Feature activation-6.02e-01

Pos logit contributions
arl+0.025
rou+0.024
run+0.023
ounces+0.020
om+0.019
Neg logit contributions
 exercises-0.030
ured-0.029
 poses-0.029
 therapist-0.029
-0.028
Token damaged
Token ID9694
Feature activation+3.53e+00

Pos logit contributions
 exercises+0.014
 therapist+0.013
ured+0.013
 poses+0.013
ger+0.013
Neg logit contributions
arl-0.013
run-0.012
rou-0.012
ounces-0.010
ucker-0.010
Token and
Token ID290
Feature activation+1.66e+00

Pos logit contributions
arl+0.072
rou+0.068
run+0.065
ounces+0.057
ucker+0.056
Neg logit contributions
 exercises-0.086
ured-0.084
 poses-0.083
 therapist-0.083
ger-0.082
Token had
Token ID550
Feature activation+2.23e+00

Pos logit contributions
arl+0.042
rou+0.039
run+0.037
ounces+0.034
om+0.033
Neg logit contributions
 exercises-0.033
 therapist-0.032
ured-0.032
 poses-0.032
ger-0.031
Token to
Token ID284
Feature activation+8.54e-02

Pos logit contributions
arl+0.048
rou+0.041
run+0.039
ounces+0.035
ucker+0.035
Neg logit contributions
 exercises-0.055
 poses-0.052
ured-0.051
 therapist-0.051
seek-0.051
Token be
Token ID307
Feature activation-1.55e-01

Pos logit contributions
arl+0.002
rou+0.002
run+0.002
ounces+0.002
om+0.002
Neg logit contributions
 exercises-0.002
ured-0.002
 poses-0.002
 therapist-0.002
ger-0.001
Token removed
Token ID4615
Feature activation+7.32e-01

Pos logit contributions
 exercises+0.004
ured+0.004
 poses+0.003
 therapist+0.003
+0.003
Neg logit contributions
arl-0.003
rou-0.003
run-0.003
ounces-0.003
ucker-0.003
Token girl
Token ID2576
Feature activation+9.96e-01

Pos logit contributions
arl+0.074
rou+0.069
run+0.068
ounces+0.059
om+0.058
Neg logit contributions
 exercises-0.073
ured-0.072
 therapist-0.072
 poses-0.071
ger-0.069
Token started
Token ID2067
Feature activation+2.36e-01

Pos logit contributions
rou+0.020
run+0.019
arl+0.018
om+0.017
o+0.017
Neg logit contributions
 exercises-0.024
 poses-0.023
seek-0.023
ured-0.022
nic-0.022
Token to
Token ID284
Feature activation+4.43e-01

Pos logit contributions
arl+0.004
rou+0.004
run+0.003
ounces+0.003
ucker+0.003
Neg logit contributions
 exercises-0.007
ured-0.007
 poses-0.006
 therapist-0.006
seek-0.006
Token cry
Token ID3960
Feature activation+7.84e-01

Pos logit contributions
arl+0.009
rou+0.009
run+0.009
ounces+0.007
om+0.007
Neg logit contributions
 exercises-0.011
ured-0.010
 poses-0.010
 therapist-0.010
-0.010
Token.
Token ID13
Feature activation+5.46e-01

Pos logit contributions
arl+0.015
rou+0.015
run+0.014
ounces+0.012
om+0.012
Neg logit contributions
 exercises-0.020
ured-0.020
 poses-0.019
 therapist-0.019
-0.018
Token<|endoftext|>
Token ID50256
Feature activation+3.58e+00

Pos logit contributions
arl+0.010
rou+0.010
run+0.009
ounces+0.008
hin+0.007
Neg logit contributions
 therapist-0.014
-0.014
 exercises-0.014
ured-0.014
 poses-0.014
TokenOnce
Token ID7454
Feature activation+7.06e-01

Pos logit contributions
rou+0.098
arl+0.093
run+0.084
ucker+0.080
ounces+0.076
Neg logit contributions
ger-0.072
 therapist-0.068
seek-0.062
 exercises-0.062
gers-0.062
Token there
Token ID612
Feature activation+6.58e-01

Pos logit contributions
arl+0.013
rou+0.012
hin+0.011
run+0.010
ounces+0.010
Neg logit contributions
 particular-0.018
ves-0.016
 she-0.016
-0.016
 exercises-0.015
Token was
Token ID373
Feature activation-5.44e-01

Pos logit contributions
arl+0.019
rou+0.018
run+0.017
ounces+0.016
om+0.016
Neg logit contributions
 exercises-0.011
ured-0.011
 poses-0.010
 therapist-0.010
-0.010
Token a
Token ID257
Feature activation+1.28e+00

Pos logit contributions
 exercises+0.013
ured+0.013
 poses+0.012
 therapist+0.012
+0.012
Neg logit contributions
arl-0.012
rou-0.011
run-0.011
ounces-0.009
ucker-0.009
Token small
Token ID1402
Feature activation+1.50e+00

Pos logit contributions
arl+0.029
run+0.029
rou+0.026
ounces+0.024
ucker+0.022
Neg logit contributions
 therapist-0.029
ger-0.028
 exercises-0.027
-0.027
nic-0.027
Token the
Token ID262
Feature activation+1.06e+00

Pos logit contributions
rou+0.034
arl+0.032
ucker+0.031
ounces+0.028
o+0.027
Neg logit contributions
 exercises-0.023
 particular-0.023
ger-0.022
ured-0.021
 poses-0.021
Token eye
Token ID4151
Feature activation+2.34e+00

Pos logit contributions
arl+0.026
rou+0.026
run+0.025
ucker+0.024
ounces+0.023
Neg logit contributions
 exercises-0.021
ger-0.020
 poses-0.020
ured-0.020
 therapist-0.020
Token.
Token ID13
Feature activation-3.07e-01

Pos logit contributions
arl+0.044
rou+0.042
run+0.040
ounces+0.034
om+0.033
Neg logit contributions
 exercises-0.061
ured-0.060
 poses-0.059
 therapist-0.059
-0.057
Token He
Token ID679
Feature activation-5.39e-01

Pos logit contributions
ured+0.005
 exercises+0.005
+0.005
edly+0.005
ger+0.005
Neg logit contributions
rou-0.009
arl-0.009
ucker-0.007
ounces-0.007
run-0.007
Token showed
Token ID3751
Feature activation-2.23e-01

Pos logit contributions
 exercises+0.011
ured+0.011
 poses+0.011
 therapist+0.011
seek+0.011
Neg logit contributions
rou-0.012
run-0.012
arl-0.012
ounces-0.011
om-0.011
Token the
Token ID262
Feature activation+1.73e-02

Pos logit contributions
ured+0.005
 exercises+0.005
ger+0.005
 poses+0.005
 therapist+0.005
Neg logit contributions
arl-0.005
rou-0.005
run-0.004
ucker-0.004
ounces-0.004
Token monster
Token ID9234
Feature activation-5.24e-01

Pos logit contributions
arl+0.000
rou+0.000
run+0.000
ucker+0.000
ounces+0.000
Neg logit contributions
 exercises-0.000
 therapist-0.000
 poses-0.000
ger-0.000
ured-0.000
Token that
Token ID326
Feature activation+2.18e+00

Pos logit contributions
 therapist+0.016
ured+0.015
+0.015
 poses+0.015
 exercises+0.015
Neg logit contributions
run-0.008
rou-0.007
arl-0.007
o-0.006
om-0.006
Token he
Token ID339
Feature activation+1.28e+00

Pos logit contributions
run+0.042
arl+0.042
rou+0.042
ounces+0.035
 Johnny+0.035
Neg logit contributions
ured-0.054
-0.051
 exercises-0.051
 poses-0.050
 therapist-0.050
Token was
Token ID373
Feature activation-5.41e-02

Pos logit contributions
rou+0.029
run+0.029
arl+0.028
ounces+0.025
om+0.025
Neg logit contributions
 exercises-0.027
ured-0.027
 poses-0.027
 therapist-0.027
seek-0.026
Token brave
Token ID14802
Feature activation+1.33e+00

Pos logit contributions
 exercises+0.001
ured+0.001
 poses+0.001
 therapist+0.001
+0.001
Neg logit contributions
arl-0.001
rou-0.001
run-0.001
ounces-0.001
ucker-0.001
Token.
Token ID13
Feature activation-9.36e-01

Pos logit contributions
 exercises+0.033
ured+0.032
 poses+0.032
 therapist+0.032
+0.031
Neg logit contributions
arl-0.022
rou-0.020
run-0.020
om-0.016
ounces-0.016
Token Her
Token ID2332
Feature activation-8.63e-02

Pos logit contributions
 exercises+0.017
 therapist+0.017
ured+0.017
 poses+0.017
+0.016
Neg logit contributions
arl-0.025
rou-0.024
run-0.022
ounces-0.020
ucker-0.020
Token mom
Token ID1995
Feature activation+1.14e+00

Pos logit contributions
 exercises+0.002
 poses+0.002
 therapist+0.002
ured+0.002
+0.002
Neg logit contributions
arl-0.002
rou-0.002
run-0.002
ounces-0.002
etchup-0.002
Token said
Token ID531
Feature activation+1.23e+00

Pos logit contributions
rou+0.023
run+0.022
arl+0.022
om+0.019
o+0.019
Neg logit contributions
 exercises-0.028
ured-0.027
 poses-0.027
seek-0.026
-0.026
Token no
Token ID645
Feature activation-2.48e-01

Pos logit contributions
arl+0.027
run+0.026
rou+0.026
om+0.023
 Johnny+0.022
Neg logit contributions
 exercises-0.027
 poses-0.027
seek-0.027
ured-0.026
-0.026
Token,
Token ID11
Feature activation+8.77e-01

Pos logit contributions
 exercises+0.006
ured+0.006
 poses+0.006
 therapist+0.006
+0.005
Neg logit contributions
arl-0.005
rou-0.005
run-0.005
ounces-0.004
om-0.004
Token it
Token ID340
Feature activation+7.39e-02

Pos logit contributions
arl+0.018
rou+0.017
run+0.017
ounces+0.015
om+0.014
Neg logit contributions
 exercises-0.021
ured-0.021
 poses-0.020
 therapist-0.020
-0.020
Token was
Token ID373
Feature activation+7.42e-01

Pos logit contributions
rou+0.002
run+0.001
o+0.001
om+0.001
arl+0.001
Neg logit contributions
ured-0.002
seek-0.002
 poses-0.002
 exercises-0.002
-0.002
Token too
Token ID1165
Feature activation-1.01e+00

Pos logit contributions
arl+0.013
rou+0.012
run+0.012
ounces+0.010
om+0.009
Neg logit contributions
 exercises-0.020
ured-0.020
 poses-0.020
 therapist-0.020
-0.019
Token dangerous
Token ID4923
Feature activation+2.63e-01

Pos logit contributions
 exercises+0.022
ger+0.022
ured+0.021
 therapist+0.021
 poses+0.021
Neg logit contributions
arl-0.021
run-0.020
rou-0.020
ounces-0.018
ucker-0.017
Token.
Token ID13
Feature activation+3.68e-01

Pos logit contributions
arl+0.004
rou+0.004
run+0.004
om+0.003
ounces+0.003
Neg logit contributions
 exercises-0.007
ured-0.007
 therapist-0.007
 poses-0.007
-0.007
Token the
Token ID262
Feature activation+3.60e-01

Pos logit contributions
 exercises+0.014
ured+0.013
 poses+0.013
ger+0.013
 therapist+0.013
Neg logit contributions
rou-0.013
arl-0.012
run-0.012
ucker-0.011
ounces-0.010
Token block
Token ID2512
Feature activation-4.17e-01

Pos logit contributions
rou+0.010
arl+0.010
run+0.010
ucker+0.009
ounces+0.009
Neg logit contributions
 exercises-0.006
 poses-0.005
ger-0.005
bees-0.005
ured-0.005
Token and
Token ID290
Feature activation+1.27e+00

Pos logit contributions
 exercises+0.009
ured+0.009
ger+0.009
 poses+0.009
 therapist+0.009
Neg logit contributions
arl-0.009
rou-0.009
run-0.008
ucker-0.007
ounces-0.007
Token break
Token ID2270
Feature activation-5.93e-01

Pos logit contributions
arl+0.029
rou+0.028
run+0.027
om+0.024
ucker+0.024
Neg logit contributions
 exercises-0.028
ured-0.027
 poses-0.027
 therapist-0.026
-0.025
Token my
Token ID616
Feature activation+5.41e-01

Pos logit contributions
 exercises+0.014
 poses+0.014
ured+0.013
ger+0.013
 therapist+0.013
Neg logit contributions
arl-0.013
rou-0.012
run-0.011
ucker-0.010
ounces-0.010
Token house
Token ID2156
Feature activation+1.43e+00

Pos logit contributions
rou+0.015
arl+0.015
ucker+0.013
run+0.013
ounces+0.012
Neg logit contributions
 exercises-0.010
 poses-0.010
ured-0.009
ger-0.009
est-0.008
Token.
Token ID13
Feature activation+1.62e-01

Pos logit contributions
arl+0.027
rou+0.026
run+0.025
ounces+0.022
ucker+0.021
Neg logit contributions
 exercises-0.036
ured-0.036
 poses-0.035
 therapist-0.035
ger-0.034
Token It
Token ID632
Feature activation-2.00e-01

Pos logit contributions
rou+0.003
arl+0.003
run+0.003
ounces+0.003
ucker+0.003
Neg logit contributions
ured-0.004
 exercises-0.004
 therapist-0.004
 poses-0.004
-0.004
Token will
Token ID481
Feature activation-9.31e-01

Pos logit contributions
 exercises+0.004
 poses+0.004
ured+0.004
 therapist+0.004
ger+0.004
Neg logit contributions
arl-0.005
rou-0.005
run-0.005
joy-0.004
ounces-0.004
Token be
Token ID307
Feature activation-9.53e-01

Pos logit contributions
 exercises+0.019
ured+0.019
 poses+0.018
 therapist+0.018
ger+0.018
Neg logit contributions
arl-0.022
rou-0.022
run-0.021
ounces-0.019
om-0.019
Token noisy
Token ID31210
Feature activation-1.17e+00

Pos logit contributions
 exercises+0.019
ured+0.018
+0.017
nic+0.017
 poses+0.016
Neg logit contributions
arl-0.024
rou-0.023
run-0.021
joy-0.020
ucker-0.020
Token and
Token ID290
Feature activation+3.98e-01

Pos logit contributions
arl+0.029
rou+0.027
run+0.026
ounces+0.023
om+0.023
Neg logit contributions
ured-0.034
 exercises-0.033
 poses-0.033
 therapist-0.032
-0.032
Token then
Token ID788
Feature activation+5.43e-01

Pos logit contributions
arl+0.008
rou+0.008
run+0.008
ounces+0.007
om+0.007
Neg logit contributions
 exercises-0.009
 poses-0.009
ured-0.009
 therapist-0.009
seek-0.009
Token it
Token ID340
Feature activation-3.45e-01

Pos logit contributions
run+0.010
rou+0.010
arl+0.009
 upon+0.008
ounces+0.008
Neg logit contributions
 exercises-0.013
 poses-0.013
seek-0.013
-0.013
ured-0.013
Token flew
Token ID13112
Feature activation+3.07e+00

Pos logit contributions
 exercises+0.008
seek+0.008
 poses+0.008
+0.007
ured+0.007
Neg logit contributions
run-0.007
rou-0.007
arl-0.006
o-0.006
 upon-0.006
Token away
Token ID1497
Feature activation+2.35e-02

Pos logit contributions
arl+0.065
rou+0.062
run+0.059
ounces+0.052
ucker+0.052
Neg logit contributions
ured-0.073
 exercises-0.072
ger-0.070
 poses-0.069
 therapist-0.069
Token.
Token ID13
Feature activation+7.18e-01

Pos logit contributions
arl+0.000
rou+0.000
run+0.000
ounces+0.000
om+0.000
Neg logit contributions
 exercises-0.001
ured-0.001
 poses-0.001
 therapist-0.001
-0.001
Token
Token ID220
Feature activation+4.36e-01

Pos logit contributions
arl+0.016
rou+0.016
run+0.013
ucker+0.012
ounces+0.012
Neg logit contributions
ured-0.016
-0.016
 exercises-0.016
 therapist-0.016
 poses-0.016
Token
Token ID198
Feature activation+1.20e+00

Pos logit contributions
rou+0.009
arl+0.009
run+0.007
joy+0.007
ounces+0.007
Neg logit contributions
-0.010
 exercises-0.010
ured-0.010
ves-0.010
seek-0.010
Token
Token ID198
Feature activation+3.00e-01

Pos logit contributions
rou+0.022
arl+0.022
run+0.018
ounces+0.017
ucker+0.016
Neg logit contributions
-0.031
Then-0.031
 exercises-0.030
ves-0.030
zed-0.030
TokenConf
Token ID18546
Feature activation-1.58e+00

Pos logit contributions
arl+0.006
run+0.006
rou+0.006
ounces+0.005
o+0.005
Neg logit contributions
 exercises-0.007
 poses-0.007
ured-0.007
 therapist-0.007
seek-0.007
Tokenused
Token ID1484
Feature activation+5.99e-01

Pos logit contributions
ured+0.023
 exercises+0.023
 therapist+0.022
 poses+0.022
+0.021
Neg logit contributions
arl-0.048
rou-0.046
run-0.045
ounces-0.041
om-0.041
Token he
Token ID339
Feature activation-1.30e-01

Pos logit contributions
arl+0.004
rou+0.003
run+0.003
om+0.003
ounces+0.003
Neg logit contributions
 exercises-0.002
 therapist-0.002
ger-0.002
ured-0.002
 poses-0.001
Token faced
Token ID7452
Feature activation-2.38e-01

Pos logit contributions
ured+0.003
 exercises+0.003
 poses+0.003
seek+0.003
 therapist+0.003
Neg logit contributions
rou-0.003
run-0.003
arl-0.003
o-0.003
ounces-0.003
Token it
Token ID340
Feature activation-9.93e-01

Pos logit contributions
ured+0.005
 exercises+0.005
seek+0.004
ger+0.004
edly+0.004
Neg logit contributions
arl-0.007
rou-0.006
ounces-0.005
ucker-0.005
run-0.005
Token with
Token ID351
Feature activation-6.28e-02

Pos logit contributions
 therapist+0.026
ured+0.026
 exercises+0.026
 poses+0.025
+0.025
Neg logit contributions
arl-0.018
rou-0.017
run-0.017
ounces-0.014
om-0.014
Token courage
Token ID11917
Feature activation+2.27e-01

Pos logit contributions
 exercises+0.001
ured+0.001
 poses+0.001
 therapist+0.001
ger+0.001
Neg logit contributions
arl-0.001
rou-0.001
run-0.001
ucker-0.001
ounces-0.001
Token.
Token ID13
Feature activation+5.49e-01

Pos logit contributions
arl+0.005
rou+0.004
run+0.004
ounces+0.004
ucker+0.003
Neg logit contributions
ured-0.006
 exercises-0.006
 poses-0.006
 therapist-0.005
seek-0.005
Token The
Token ID383
Feature activation+1.06e-03

Pos logit contributions
arl+0.014
rou+0.013
ounces+0.013
run+0.011
om+0.011
Neg logit contributions
ger-0.010
 exercises-0.010
-0.010
-0.010
zed-0.010
Token moral
Token ID6573
Feature activation+6.78e-01

Pos logit contributions
arl+0.000
run+0.000
rou+0.000
ounces+0.000
ucker+0.000
Neg logit contributions
 exercises-0.000
ger-0.000
ured-0.000
 therapist-0.000
 poses-0.000
Token of
Token ID286
Feature activation+2.47e+00

Pos logit contributions
arl+0.010
rou+0.009
run+0.009
ounces+0.007
ucker+0.007
Neg logit contributions
 exercises-0.021
 therapist-0.020
 poses-0.020
ger-0.020
ured-0.020
Token the
Token ID262
Feature activation+6.84e-01

Pos logit contributions
arl+0.066
run+0.057
rou+0.056
ounces+0.054
om+0.051
Neg logit contributions
 exercises-0.051
zed-0.049
 therapist-0.048
 poses-0.047
ger-0.045
Token story
Token ID1621
Feature activation-2.21e-01

Pos logit contributions
arl+0.016
run+0.016
rou+0.015
ounces+0.014
om+0.013
Neg logit contributions
 exercises-0.015
 poses-0.014
ger-0.014
 therapist-0.014
nic-0.014
Token she
Token ID673
Feature activation+9.95e-01

Pos logit contributions
 exercises+0.002
ured+0.002
 poses+0.002
+0.002
 therapist+0.002
Neg logit contributions
rou-0.002
arl-0.002
run-0.002
ounces-0.001
 upon-0.001
Token found
Token ID1043
Feature activation+1.30e+00

Pos logit contributions
rou+0.019
run+0.018
o+0.016
 upon+0.016
om+0.015
Neg logit contributions
 exercises-0.026
 poses-0.026
ured-0.024
seek-0.024
nic-0.024
Token a
Token ID257
Feature activation+1.40e+00

Pos logit contributions
arl+0.024
rou+0.024
run+0.023
ounces+0.019
om+0.019
Neg logit contributions
 exercises-0.032
ured-0.032
 poses-0.032
 therapist-0.032
-0.031
Token new
Token ID649
Feature activation-5.41e-01

Pos logit contributions
arl+0.034
run+0.031
rou+0.031
ucker+0.027
ounces+0.027
Neg logit contributions
 exercises-0.030
 therapist-0.029
 poses-0.028
ger-0.028
ured-0.028
Token friend
Token ID1545
Feature activation+1.00e+00

Pos logit contributions
ured+0.013
 exercises+0.013
 poses+0.012
 therapist+0.012
+0.012
Neg logit contributions
arl-0.011
rou-0.011
run-0.010
om-0.009
ounces-0.009
Token.
Token ID13
Feature activation-3.52e-01

Pos logit contributions
arl+0.019
rou+0.018
run+0.017
ounces+0.015
om+0.015
Neg logit contributions
 exercises-0.026
ured-0.025
 therapist-0.025
 poses-0.025
ger-0.024
Token It
Token ID632
Feature activation-1.49e+00

Pos logit contributions
 exercises+0.008
ured+0.008
 poses+0.007
 therapist+0.007
+0.007
Neg logit contributions
arl-0.008
rou-0.008
run-0.007
ounces-0.007
ucker-0.006
Token was
Token ID373
Feature activation-4.62e-01

Pos logit contributions
 poses+0.039
 exercises+0.039
seek+0.039
ured+0.039
edly+0.036
Neg logit contributions
run-0.026
rou-0.026
o-0.024
 Tim-0.023
 Johnny-0.022
Token a
Token ID257
Feature activation+2.20e+00

Pos logit contributions
 exercises+0.011
ured+0.011
 poses+0.011
 therapist+0.011
+0.011
Neg logit contributions
arl-0.009
rou-0.009
run-0.008
ounces-0.007
om-0.007
Token big
Token ID1263
Feature activation+1.93e+00

Pos logit contributions
arl+0.052
run+0.050
rou+0.049
ucker+0.044
ounces+0.043
Neg logit contributions
 therapist-0.045
 exercises-0.045
ger-0.044
ured-0.044
 poses-0.043
Token,
Token ID11
Feature activation+5.59e-01

Pos logit contributions
arl+0.046
rou+0.044
run+0.043
ucker+0.038
ounces+0.038
Neg logit contributions
 exercises-0.040
ger-0.039
 therapist-0.039
ured-0.039
 poses-0.039
Token feeling
Token ID4203
Feature activation+1.20e+00

Pos logit contributions
arl+0.020
rou+0.019
run+0.018
ounces+0.016
om+0.016
Neg logit contributions
 exercises-0.022
ured-0.022
 poses-0.022
 therapist-0.022
-0.021
Token proud
Token ID6613
Feature activation+1.38e+00

Pos logit contributions
rou+0.015
run+0.014
arl+0.014
 late+0.012
joy+0.011
Neg logit contributions
seek-0.037
bees-0.036
 poses-0.036
 exercises-0.036
ured-0.036
Token of
Token ID286
Feature activation+2.61e+00

Pos logit contributions
arl+0.021
rou+0.020
run+0.019
ounces+0.015
om+0.015
Neg logit contributions
 exercises-0.041
ured-0.041
 poses-0.040
 therapist-0.040
-0.039
Token herself
Token ID5223
Feature activation+1.07e+00

Pos logit contributions
arl+0.089
rou+0.086
run+0.084
ounces+0.077
om+0.076
Neg logit contributions
 exercises-0.028
ured-0.028
 therapist-0.027
 poses-0.027
-0.025
Token.
Token ID13
Feature activation+5.14e-01

Pos logit contributions
arl+0.021
rou+0.020
run+0.019
ounces+0.017
ucker+0.016
Neg logit contributions
 exercises-0.027
ured-0.027
 poses-0.026
 therapist-0.026
ger-0.025
Token
Token ID220
Feature activation-4.14e-01

Pos logit contributions
arl+0.011
rou+0.011
run+0.010
ounces+0.009
hin+0.009
Neg logit contributions
 therapist-0.012
-0.011
 exercises-0.011
ured-0.011
 poses-0.011
Token
Token ID198
Feature activation+5.01e-01

Pos logit contributions
 exercises+0.011
+0.011
 poses+0.011
ured+0.011
 therapist+0.011
Neg logit contributions
arl-0.006
rou-0.006
run-0.005
ounces-0.005
etchup-0.005
Token
Token ID198
Feature activation-4.92e-02

Pos logit contributions
arl+0.007
rou+0.007
run+0.006
ounces+0.005
ucker+0.005
Neg logit contributions
 therapist-0.015
 exercises-0.015
ured-0.014
-0.014
 poses-0.014
TokenH
Token ID39
Feature activation-6.59e-01

Pos logit contributions
ger+0.001
 exercises+0.001
 poses+0.001
ured+0.001
seek+0.001
Neg logit contributions
arl-0.001
rou-0.001
run-0.001
om-0.001
 upon-0.001
Tokenannah
Token ID25761
Feature activation-1.90e+00

Pos logit contributions
 exercises+0.015
 poses+0.014
ured+0.013
ger+0.013
 therapist+0.013
Neg logit contributions
arl-0.016
rou-0.015
run-0.014
om-0.013
ucker-0.013
Token and
Token ID290
Feature activation-1.10e-01

Pos logit contributions
 exercises+0.046
 poses+0.044
ured+0.043
ger+0.042
+0.042
Neg logit contributions
rou-0.039
run-0.035
arl-0.034
om-0.033
o-0.033
Token
Token ID198
Feature activation-3.74e-01

Pos logit contributions
 poses+0.017
 exercises+0.017
ured+0.017
+0.016
+0.016
Neg logit contributions
rou-0.027
arl-0.022
run-0.022
arse-0.020
ounces-0.020
Token
Token ID198
Feature activation-1.01e+00

Pos logit contributions
ves+0.009
Then+0.009
+0.008
Her+0.008
 exercises+0.008
Neg logit contributions
arl-0.008
rou-0.008
run-0.007
 bruised-0.007
joy-0.006
Token"
Token ID1
Feature activation-2.17e+00

Pos logit contributions
 exercises+0.022
 poses+0.022
ured+0.021
 therapist+0.021
seek+0.021
Neg logit contributions
arl-0.024
rou-0.022
run-0.021
om-0.020
ounces-0.019
TokenLet
Token ID5756
Feature activation-1.32e+00

Pos logit contributions
 exercises+0.054
ured+0.054
 poses+0.052
 therapist+0.051
+0.051
Neg logit contributions
arl-0.045
rou-0.038
run-0.036
etchup-0.034
om-0.033
Token's
Token ID338
Feature activation-3.50e+00

Pos logit contributions
edly+0.031
seek+0.030
ured+0.030
 poses+0.030
 exercises+0.029
Neg logit contributions
arl-0.027
rou-0.025
run-0.024
om-0.024
 Johnny-0.023
Token make
Token ID787
Feature activation-3.21e+00

Pos logit contributions
 exercises+0.069
ured+0.068
 poses+0.067
 therapist+0.067
+0.064
Neg logit contributions
arl-0.088
rou-0.084
run-0.081
ounces-0.072
om-0.072
Token a
Token ID257
Feature activation-6.35e-01

Pos logit contributions
 exercises+0.067
 poses+0.064
ured+0.063
 therapist+0.059
nic+0.058
Neg logit contributions
arl-0.084
rou-0.076
run-0.076
ounces-0.065
ucker-0.064
Token cake
Token ID12187
Feature activation-2.66e+00

Pos logit contributions
 exercises+0.013
 poses+0.012
 therapist+0.012
 vows+0.012
ured+0.012
Neg logit contributions
arl-0.016
run-0.016
rou-0.015
ucker-0.013
ounces-0.013
Token!"
Token ID2474
Feature activation-1.30e+00

Pos logit contributions
 exercises+0.077
ured+0.077
 therapist+0.076
 poses+0.075
+0.075
Neg logit contributions
arl-0.040
rou-0.038
run-0.035
om-0.031
ounces-0.030
Token Lily
Token ID20037
Feature activation-1.45e-01

Pos logit contributions
 exercises+0.039
ured+0.039
 therapist+0.038
 poses+0.038
ger+0.037
Neg logit contributions
arl-0.019
rou-0.019
run-0.017
ounces-0.013
om-0.013
Token said
Token ID531
Feature activation+4.96e-01

Pos logit contributions
 exercises+0.004
ured+0.004
zed+0.004
ger+0.004
+0.004
Neg logit contributions
run-0.002
om-0.002
rou-0.002
o-0.002
ucker-0.002
Token with
Token ID351
Feature activation-1.02e+00

Pos logit contributions
ured+0.038
 therapist+0.036
 exercises+0.036
 vows+0.036
cious+0.036
Neg logit contributions
arl-0.022
rou-0.021
run-0.021
om-0.017
ounces-0.016
Token her
Token ID607
Feature activation-7.19e-01

Pos logit contributions
ured+0.022
ger+0.021
 therapist+0.020
nic+0.020
cious+0.020
Neg logit contributions
arl-0.023
run-0.023
rou-0.023
 Johnny-0.020
om-0.020
Token toys
Token ID14958
Feature activation-2.14e+00

Pos logit contributions
ured+0.018
ger+0.017
seek+0.017
nic+0.016
 exercises+0.016
Neg logit contributions
arl-0.014
rou-0.014
run-0.013
om-0.012
ounces-0.011
Token,
Token ID11
Feature activation+3.51e-01

Pos logit contributions
ured+0.059
 exercises+0.058
+0.057
 poses+0.057
 therapist+0.057
Neg logit contributions
arl-0.036
rou-0.035
run-0.033
ounces-0.027
om-0.027
Token but
Token ID475
Feature activation-8.82e-01

Pos logit contributions
arl+0.008
rou+0.008
run+0.007
ounces+0.007
om+0.006
Neg logit contributions
 exercises-0.008
 poses-0.008
ured-0.008
 therapist-0.007
-0.007
Token accidentally
Token ID14716
Feature activation-4.09e-01

Pos logit contributions
+0.022
ured+0.022
 poses+0.022
seek+0.022
 exercises+0.022
Neg logit contributions
 Tim-0.015
 upon-0.015
rou-0.015
run-0.014
 Johnny-0.014
Token split
Token ID6626
Feature activation-1.63e+00

Pos logit contributions
ured+0.011
 exercises+0.011
 poses+0.011
+0.010
 therapist+0.010
Neg logit contributions
run-0.007
rou-0.007
arl-0.007
ounces-0.006
om-0.006
Token her
Token ID607
Feature activation-4.56e-01

Pos logit contributions
 exercises+0.034
ured+0.033
 poses+0.032
 therapist+0.032
+0.031
Neg logit contributions
arl-0.040
rou-0.038
run-0.036
ounces-0.032
om-0.032
Token favorite
Token ID4004
Feature activation-1.70e+00

Pos logit contributions
 exercises+0.011
ured+0.011
 therapist+0.011
 poses+0.011
seek+0.011
Neg logit contributions
arl-0.009
rou-0.009
run-0.008
om-0.007
ounces-0.007
Token doll
Token ID3654
Feature activation+2.35e-01

Pos logit contributions
ured+0.045
seek+0.043
 therapist+0.042
nic+0.042
+0.041
Neg logit contributions
run-0.030
arl-0.028
om-0.027
rou-0.026
 upon-0.023
Token in
Token ID287
Feature activation+6.94e-01

Pos logit contributions
arl+0.004
rou+0.004
run+0.004
om+0.003
ounces+0.003
Neg logit contributions
 exercises-0.006
 therapist-0.006
ured-0.006
 poses-0.006
-0.006
Token.
Token ID13
Feature activation+7.27e-02

Pos logit contributions
 exercises+0.010
ured+0.010
 poses+0.010
 therapist+0.010
ger+0.010
Neg logit contributions
arl-0.010
rou-0.009
run-0.009
ounces-0.008
ucker-0.008
Token He
Token ID679
Feature activation-6.48e-01

Pos logit contributions
arl+0.001
rou+0.001
run+0.001
ounces+0.001
ucker+0.001
Neg logit contributions
 exercises-0.002
ured-0.002
-0.002
 therapist-0.002
 poses-0.002
Token wanted
Token ID2227
Feature activation+7.08e-01

Pos logit contributions
 exercises+0.016
 poses+0.015
ured+0.015
seek+0.015
 therapist+0.015
Neg logit contributions
rou-0.012
run-0.012
 Johnny-0.011
arl-0.011
o-0.011
Token to
Token ID284
Feature activation-6.56e-01

Pos logit contributions
arl+0.012
rou+0.011
run+0.010
ounces+0.008
om+0.008
Neg logit contributions
 exercises-0.021
 poses-0.020
ured-0.020
 therapist-0.020
seek-0.019
Token make
Token ID787
Feature activation-2.70e+00

Pos logit contributions
ured+0.015
 therapist+0.015
 exercises+0.015
 poses+0.015
+0.015
Neg logit contributions
arl-0.013
rou-0.012
run-0.012
ounces-0.010
 Johnny-0.010
Token things
Token ID1243
Feature activation-2.48e+00

Pos logit contributions
 exercises+0.058
 poses+0.054
ured+0.052
 therapist+0.050
+0.050
Neg logit contributions
arl-0.073
rou-0.066
run-0.063
ounces-0.057
ucker-0.056
Token better
Token ID1365
Feature activation-2.24e-01

Pos logit contributions
 exercises+0.063
ured+0.063
 therapist+0.062
 poses+0.061
+0.060
Neg logit contributions
arl-0.048
rou-0.045
run-0.043
ounces-0.036
om-0.036
Token but
Token ID475
Feature activation+5.74e-01

Pos logit contributions
 exercises+0.006
 therapist+0.006
ured+0.006
+0.006
 poses+0.006
Neg logit contributions
arl-0.004
rou-0.003
run-0.003
om-0.003
o-0.003
Token didn
Token ID1422
Feature activation+1.39e+00

Pos logit contributions
run+0.011
rou+0.011
arl+0.011
 upon+0.010
 Johnny+0.010
Neg logit contributions
ured-0.014
 poses-0.014
 exercises-0.014
nic-0.013
-0.013
Token't
Token ID470
Feature activation+1.32e+00

Pos logit contributions
arl+0.027
rou+0.025
run+0.025
ounces+0.021
om+0.021
Neg logit contributions
 exercises-0.035
ured-0.035
 poses-0.034
 therapist-0.034
-0.033
Token know
Token ID760
Feature activation+1.09e+00

Pos logit contributions
arl+0.030
rou+0.030
run+0.028
ounces+0.025
om+0.025
Neg logit contributions
 exercises-0.029
ured-0.028
 poses-0.028
 therapist-0.028
ger-0.027
Token green
Token ID4077
Feature activation-1.03e+00

Pos logit contributions
rou+0.007
arl+0.007
run+0.007
om+0.006
ounces+0.005
Neg logit contributions
 therapist-0.019
 exercises-0.019
ured-0.019
seek-0.019
 poses-0.019
Token.
Token ID13
Feature activation+3.69e-01

Pos logit contributions
ured+0.029
 therapist+0.028
nic+0.028
+0.027
zed+0.027
Neg logit contributions
arl-0.016
rou-0.016
run-0.015
om-0.014
ounces-0.013
Token
Token ID198
Feature activation+6.00e-01

Pos logit contributions
rou+0.008
arl+0.007
run+0.007
hin+0.006
ounces+0.006
Neg logit contributions
 exercises-0.009
-0.009
 poses-0.009
 therapist-0.008
seek-0.008
Token
Token ID198
Feature activation-2.07e-01

Pos logit contributions
rou+0.013
arl+0.013
run+0.011
 bruised+0.011
joy+0.011
Neg logit contributions
ves-0.014
Then-0.013
-0.013
Her-0.013
 exercises-0.013
Token"
Token ID1
Feature activation-2.02e+00

Pos logit contributions
 poses+0.006
 therapist+0.006
 exercises+0.006
edly+0.006
ured+0.006
Neg logit contributions
arl-0.004
rou-0.003
om-0.003
run-0.003
o-0.003
TokenLook
Token ID8567
Feature activation-3.95e+00

Pos logit contributions
ured+0.055
 exercises+0.055
 therapist+0.053
 poses+0.052
ger+0.052
Neg logit contributions
arl-0.036
rou-0.033
run-0.031
om-0.027
ounces-0.027
Token at
Token ID379
Feature activation-8.08e-01

Pos logit contributions
 exercises+0.075
ured+0.074
+0.072
 poses+0.070
nic+0.069
Neg logit contributions
arl-0.102
rou-0.095
run-0.092
ucker-0.081
etchup-0.080
Token my
Token ID616
Feature activation-2.10e+00

Pos logit contributions
 exercises+0.018
 poses+0.017
ured+0.017
+0.016
 therapist+0.015
Neg logit contributions
arl-0.019
run-0.018
rou-0.018
ucker-0.015
ounces-0.015
Token pattern
Token ID3912
Feature activation-2.79e+00

Pos logit contributions
 exercises+0.041
 poses+0.040
ured+0.038
ger+0.037
 therapist+0.036
Neg logit contributions
arl-0.055
rou-0.051
run-0.050
ucker-0.044
ounces-0.044
Token,
Token ID11
Feature activation-5.62e-02

Pos logit contributions
ured+0.086
 therapist+0.085
 exercises+0.084
+0.083
 poses+0.082
Neg logit contributions
arl-0.037
rou-0.036
run-0.035
om-0.029
ounces-0.027
Token Tom
Token ID4186
Feature activation-2.62e-01

Pos logit contributions
 exercises+0.002
ured+0.002
 poses+0.002
 therapist+0.002
nic+0.002
Neg logit contributions
arl-0.001
rou-0.001
run-0.001
ounces-0.001
ucker-0.001
TokenTim
Token ID14967
Feature activation-6.14e-01

Pos logit contributions
 poses+0.010
 exercises+0.010
 therapist+0.010
ured+0.010
 vows+0.009
Neg logit contributions
arl-0.009
rou-0.009
run-0.009
om-0.008
o-0.007
Tokenmy
Token ID1820
Feature activation-1.04e+00

Pos logit contributions
 exercises+0.017
seek+0.017
ger+0.016
 therapist+0.016
 poses+0.016
Neg logit contributions
arl-0.011
run-0.009
rou-0.009
ounces-0.007
ucker-0.007
Token's
Token ID338
Feature activation-6.05e-01

Pos logit contributions
 exercises+0.022
ured+0.022
 poses+0.021
 therapist+0.021
+0.020
Neg logit contributions
rou-0.024
arl-0.023
run-0.023
om-0.021
o-0.020
Token team
Token ID1074
Feature activation-1.87e+00

Pos logit contributions
 exercises+0.016
ured+0.016
 poses+0.016
 therapist+0.016
+0.015
Neg logit contributions
arl-0.011
rou-0.010
run-0.010
ounces-0.008
om-0.008
Token practiced
Token ID19893
Feature activation-2.91e+00

Pos logit contributions
 exercises+0.040
ured+0.039
 poses+0.039
 therapist+0.038
+0.037
Neg logit contributions
run-0.042
rou-0.041
arl-0.041
om-0.037
ounces-0.036
Token hard
Token ID1327
Feature activation-3.47e+00

Pos logit contributions
ured+0.071
 therapist+0.071
ger+0.067
+0.067
 exercises+0.066
Neg logit contributions
arl-0.056
rou-0.055
run-0.055
om-0.048
ounces-0.046
Token and
Token ID290
Feature activation-4.56e-01

Pos logit contributions
ured+0.088
 therapist+0.088
 exercises+0.087
 poses+0.085
+0.085
Neg logit contributions
arl-0.066
rou-0.062
run-0.061
om-0.052
ounces-0.050
Token they
Token ID484
Feature activation-1.76e+00

Pos logit contributions
 therapist+0.010
ured+0.010
+0.009
 exercises+0.009
 vows+0.009
Neg logit contributions
run-0.010
rou-0.010
arl-0.009
 Johnny-0.009
om-0.009
Token were
Token ID547
Feature activation-9.64e-01

Pos logit contributions
ured+0.037
 exercises+0.036
 therapist+0.036
 poses+0.036
+0.034
Neg logit contributions
arl-0.039
rou-0.039
run-0.039
om-0.035
ounces-0.034
Token excited
Token ID6568
Feature activation+2.29e-01

Pos logit contributions
 exercises+0.024
ured+0.024
 poses+0.023
 therapist+0.023
+0.023
Neg logit contributions
arl-0.019
rou-0.018
run-0.018
ounces-0.015
om-0.015
Token to
Token ID284
Feature activation-2.01e-01

Pos logit contributions
arl+0.003
rou+0.003
run+0.003
ounces+0.002
om+0.002
Neg logit contributions
 exercises-0.007
ured-0.007
 therapist-0.007
 poses-0.007
-0.006
Token
Token ID241
Feature activation-8.17e-01

Pos logit contributions
 therapist+0.005
 poses+0.004
bees+0.004
 exercises+0.004
 vows+0.004
Neg logit contributions
arl-0.012
rou-0.011
run-0.010
om-0.010
ounces-0.010
TokenIf
Token ID1532
Feature activation+1.41e+00

Pos logit contributions
 exercises+0.016
 poses+0.016
ured+0.016
 therapist+0.016
+0.015
Neg logit contributions
arl-0.021
rou-0.019
run-0.019
ounces-0.017
ucker-0.017
Token you
Token ID345
Feature activation-1.97e+00

Pos logit contributions
arl+0.024
rou+0.023
run+0.021
ounces+0.018
om+0.018
Neg logit contributions
 exercises-0.039
ured-0.039
 poses-0.038
 therapist-0.038
-0.037
Token don
Token ID836
Feature activation-1.76e+00

Pos logit contributions
 exercises+0.046
ured+0.045
 therapist+0.044
 poses+0.043
ger+0.043
Neg logit contributions
arl-0.042
rou-0.039
run-0.037
ounces-0.034
joy-0.034
Tokenâ
Token ID22940
Feature activation-9.95e-01

Pos logit contributions
 exercises+0.043
ured+0.042
 therapist+0.041
 poses+0.041
+0.040
Neg logit contributions
arl-0.036
rou-0.035
run-0.033
ounces-0.029
om-0.028
Token
Token ID26391
Feature activation-3.87e+00

Pos logit contributions
 poses+0.015
 therapist+0.015
ured+0.015
 exercises+0.015
 vows+0.015
Neg logit contributions
arl-0.029
run-0.026
rou-0.026
o-0.023
ucker-0.023
Token
Token ID8151
Feature activation+1.07e+00

Pos logit contributions
 exercises+0.082
+0.079
seek+0.079
 poses+0.078
ured+0.077
Neg logit contributions
arl-0.093
rou-0.088
run-0.082
ounces-0.072
ucker-0.072
Tokent
Token ID83
Feature activation-2.23e-01

Pos logit contributions
arl+0.027
rou+0.027
ounces+0.024
hin+0.023
ucker+0.023
Neg logit contributions
seek-0.017
ger-0.016
 therapist-0.016
 exercises-0.016
ves-0.016
Token do
Token ID466
Feature activation-2.99e+00

Pos logit contributions
 exercises+0.005
ured+0.005
 therapist+0.005
 poses+0.005
+0.005
Neg logit contributions
arl-0.005
rou-0.005
run-0.004
om-0.004
ounces-0.004
Token what
Token ID644
Feature activation-6.88e-01

Pos logit contributions
ured+0.083
 therapist+0.083
edly+0.081
 poses+0.081
seek+0.080
Neg logit contributions
arl-0.048
run-0.046
rou-0.045
ounces-0.035
om-0.035
Token I
Token ID314
Feature activation+1.79e+00

Pos logit contributions
ured+0.021
seek+0.020
edly+0.020
 therapist+0.019
ger+0.019
Neg logit contributions
run-0.010
arl-0.010
rou-0.010
 Johnny-0.008
om-0.008
Token to
Token ID284
Feature activation-2.43e-01

Pos logit contributions
 therapist+0.089
+0.088
ured+0.087
ger+0.086
+0.086
Neg logit contributions
run-0.010
rou-0.009
arl-0.005
om-0.004
o-0.003
Token her
Token ID607
Feature activation+3.31e-02

Pos logit contributions
 exercises+0.004
ured+0.004
 poses+0.004
 therapist+0.004
+0.004
Neg logit contributions
arl-0.007
rou-0.007
run-0.007
ounces-0.006
om-0.006
Token mom
Token ID1995
Feature activation+1.60e+00

Pos logit contributions
arl+0.001
rou+0.001
run+0.001
ounces+0.001
ucker+0.001
Neg logit contributions
 exercises-0.001
 poses-0.001
 therapist-0.001
ured-0.001
-0.001
Token.
Token ID13
Feature activation-4.96e-01

Pos logit contributions
arl+0.031
rou+0.030
run+0.029
ounces+0.024
om+0.024
Neg logit contributions
 exercises-0.040
ured-0.040
 poses-0.039
 therapist-0.039
-0.038
Token "
Token ID366
Feature activation-2.35e+00

Pos logit contributions
 exercises+0.012
ger+0.012
nic+0.012
seek+0.012
ured+0.012
Neg logit contributions
arl-0.011
run-0.009
rou-0.009
om-0.009
ucker-0.008
TokenLook
Token ID8567
Feature activation-3.95e+00

Pos logit contributions
 exercises+0.055
ured+0.054
 poses+0.052
ger+0.051
 therapist+0.050
Neg logit contributions
arl-0.052
rou-0.049
run-0.045
om-0.041
ounces-0.040
Token,
Token ID11
Feature activation-7.30e-01

Pos logit contributions
 exercises+0.082
ured+0.081
 poses+0.079
 therapist+0.078
+0.077
Neg logit contributions
arl-0.095
rou-0.090
run-0.088
ounces-0.077
om-0.076
Token mom
Token ID1995
Feature activation+1.15e+00

Pos logit contributions
ured+0.021
 exercises+0.021
 poses+0.021
seek+0.020
 therapist+0.020
Neg logit contributions
arl-0.012
rou-0.011
run-0.010
ounces-0.009
om-0.008
Token,
Token ID11
Feature activation-8.14e-01

Pos logit contributions
arl+0.033
run+0.029
rou+0.028
ucker+0.025
ounces+0.025
Neg logit contributions
 exercises-0.018
seek-0.017
 poses-0.017
ger-0.017
 therapist-0.017
Token I
Token ID314
Feature activation+7.11e-01

Pos logit contributions
 exercises+0.020
ured+0.020
 poses+0.020
 therapist+0.019
+0.019
Neg logit contributions
arl-0.016
rou-0.016
run-0.015
ounces-0.013
om-0.013
Token made
Token ID925
Feature activation-1.91e+00

Pos logit contributions
arl+0.016
rou+0.016
run+0.015
om+0.013
ounces+0.013
Neg logit contributions
 exercises-0.015
ured-0.015
 therapist-0.015
 poses-0.015
-0.014
Token will
Token ID481
Feature activation+2.50e-01

Pos logit contributions
seek+0.049
ured+0.049
 exercises+0.048
 therapist+0.048
 poses+0.047
Neg logit contributions
rou-0.034
arl-0.034
run-0.032
om-0.029
ucker-0.027
Token be
Token ID307
Feature activation-7.01e-01

Pos logit contributions
arl+0.006
rou+0.006
run+0.005
om+0.005
ounces+0.005
Neg logit contributions
 exercises-0.005
 therapist-0.005
 poses-0.005
ured-0.005
seek-0.005
Token ready
Token ID3492
Feature activation-1.81e-01

Pos logit contributions
 exercises+0.016
ured+0.016
 poses+0.015
+0.015
 therapist+0.015
Neg logit contributions
arl-0.016
rou-0.015
run-0.015
ounces-0.013
om-0.012
Token in
Token ID287
Feature activation+5.46e-01

Pos logit contributions
 exercises+0.005
ured+0.005
 poses+0.005
 therapist+0.005
+0.004
Neg logit contributions
arl-0.003
rou-0.003
run-0.003
ounces-0.003
om-0.003
Token about
Token ID546
Feature activation+1.70e+00

Pos logit contributions
arl+0.013
rou+0.012
run+0.011
ounces+0.010
ucker+0.010
Neg logit contributions
 exercises-0.012
ured-0.012
 poses-0.011
 therapist-0.011
-0.011
Token 30
Token ID1542
Feature activation-5.73e+00

Pos logit contributions
arl+0.035
rou+0.032
run+0.031
ounces+0.027
ucker+0.026
Neg logit contributions
 exercises-0.042
ured-0.042
 poses-0.041
 therapist-0.040
-0.039
Token minutes
Token ID2431
Feature activation-6.88e-01

Pos logit contributions
ured+0.133
seek+0.132
 therapist+0.131
 exercises+0.129
ger+0.129
Neg logit contributions
arl-0.114
rou-0.114
run-0.108
om-0.101
etchup-0.096
Token.
Token ID13
Feature activation+6.33e-01

Pos logit contributions
ured+0.017
 exercises+0.017
 poses+0.016
 therapist+0.016
+0.016
Neg logit contributions
arl-0.014
rou-0.013
run-0.013
ounces-0.011
ucker-0.011
Token You
Token ID921
Feature activation-5.25e-01

Pos logit contributions
arl+0.014
rou+0.013
run+0.012
ounces+0.011
om+0.011
Neg logit contributions
ured-0.014
 exercises-0.014
 therapist-0.014
-0.014
 poses-0.014
Token can
Token ID460
Feature activation-8.81e-01

Pos logit contributions
 exercises+0.013
ured+0.013
 poses+0.012
 therapist+0.012
+0.012
Neg logit contributions
arl-0.011
rou-0.010
run-0.010
ounces-0.009
ucker-0.008
Token help
Token ID1037
Feature activation-1.27e+00

Pos logit contributions
 exercises+0.020
ured+0.020
 therapist+0.020
 poses+0.020
ger+0.020
Neg logit contributions
arl-0.019
rou-0.018
run-0.017
om-0.015
ounces-0.015