TOP ACTIVATIONS
MAX = 7.849

" Hello , Farmer Joe ," they say .
said , " I am Doctor Lily . I will listen
" Sorry , Farmer Joe ," Sam says .
" Hello there !" Doctor Rock er called .
" Sorry , Farmer Joe ," Ben says .
said , " I am Nurse Sam . I will give
" Hello , this is Doctor Ben . Do you have
what this mineral is ?" Doctor Rock er asked .
" Are you okay , Princess Lily ?" asked the servant
Hello Princess ! I am here to
can we do ?" asked Little Jack .
you for serving me , Doctor ," said Lily .
" Hello , I am Princess Purple !" Lily says as
at the rain !" shouted Little Matt , excited ly pointing
we going today ?" asked Little Sarah . She was hopping
" Help ! Help !" Little Annie shouted from the top
friend said , " Hi Farmer ! I have some hay
. " Hello , my darling !" he said , smiling
" Hello , I am Miss Cat , and I am
Moon who said , " Little one , be careful and

MIN ACTIVATIONS
MIN = -7.020

as fast as he could to fly the strip .
you . You learned how to share and how to be
and soon they were able to stop him from scoring any
so she would be able to open and close the lock
together he and Lisa managed to fix the broken board .
managed to figure out how to move the wagon by herself
kids . You learned how to solve your problem and be
, Sara ! You managed to plant your first bulb !"
. They clip the wires to their toy car and the
Why don 't you try to make something else with the
There were so many sticks to choose from ! The girl
different . You learned how to zip the ball . You
children . You learned how to share and be kind .
tell them how he managed to make his hair so perfect
. He taught them how to cook , how to play
impressed with how he managed to try and make something so
the black board so they could buy some chalk . Lily
and worked until she managed to solve the puzzle . She
and soon it was able to fly again . Ros ie
, Bob and Jay managed to make the most beautiful castle

INTERVAL 4.131 - 7.849
CONTAINS 0.943%

Mom my , what 's this ?
a time , there was a big bird called an ostr
" Look , Ben , a big red ball
a time there was a little pig named Pe ppa .
to Jill , " Look , our bathing suits match !"

INTERVAL 0.414 - 4.131
CONTAINS 39.075%

a noise . It was a monster ! The monster was
grace fully in the sky and she knew that the crane
doctor was very nice and talked a lot with them .
, " This candy is too sour for me ." Lily
with his pillow , and was sad . His mom saw

INTERVAL -3.303 - 0.414
CONTAINS 57.196%

It was time to go inside . But Miss
quickly grabbed the hose and started spraying water on his yellow
and they applauded him too . It was a great day
, car . I drew a pink flower . Isn 't
They saw a big metal thing with wheels and a horn

INTERVAL -7.020 - -3.303
CONTAINS 2.786%

and help them wash . They give them towels and p
clever before . He wanted to watch the bird as it
saw the bird and wanted to help . "
wanted to escape but he couldn 't move . Joe started
. They couldn 't wait to show it to their mum
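
NOTE ON THE INTERVALS
The four INTERVAL ranges above look like an equal-width split of the span from MIN to MAX, and their CONTAINS percentages sum to 100. The sketch below is only a consistency check on the printed numbers. The equal-width rule is an assumption read off the values, not a documented property of the tool that produced this page, and the names (interval_edges, ACT_MIN, ACT_MAX, N_BINS) are hypothetical.

```python
# Values copied from the MIN = / MAX = lines of this dump (already rounded to 3 decimals).
ACT_MIN = -7.020
ACT_MAX = 7.849
N_BINS = 4  # assumption: four equal-width intervals


def interval_edges(lo, hi, n_bins):
    """Return n_bins + 1 equally spaced edges running from lo to hi."""
    width = (hi - lo) / n_bins
    return [lo + i * width for i in range(n_bins + 1)]


edges = interval_edges(ACT_MIN, ACT_MAX, N_BINS)

# Edges as printed in the INTERVAL headings, lowest first.
listed_edges = [-7.020, -3.303, 0.414, 4.131, 7.849]

# CONTAINS percentages as printed, highest interval first.
percent = [0.943, 39.075, 57.196, 2.786]

# Largest gap between a computed edge and the printed one; small gaps are
# expected because MIN and MAX are themselves rounded in the dump.
max_gap = max(abs(a - b) for a, b in zip(edges, listed_edges))
total_percent = sum(percent)

print("computed edges:", [round(e, 4) for e in edges])
print("largest gap to printed edges:", round(max_gap, 5))
print("sum of percentages:", round(total_percent, 3))
```

If the gap stays within rounding error, the printed intervals are consistent with an even four-way split of the activation range. Nearly all sampled activations (about 96%) then fall in the two middle intervals.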
Token
Token ID198
Feature activation-9.92e-01

Pos logit contributions
 oh+0.013
 dear+0.011
Run+0.010
 gust+0.010
 Please+0.010
Neg logit contributions
 ease-0.015
ws-0.014
 stairs-0.013
 driveway-0.013
zip-0.013
Token
Token ID198
Feature activation+1.31e-01

Pos logit contributions
 ease+0.022
ws+0.022
ips+0.021
 surfaces+0.021
zip+0.021
Neg logit contributions
 oh-0.023
 dear-0.022
Run-0.022
 Uncle-0.021
 Please-0.021
Token"
Token ID1
Feature activation+5.05e-01

Pos logit contributions
 oh+0.003
 dear+0.003
 Please+0.003
 gust+0.003
 bye+0.003
Neg logit contributions
 ease-0.003
zip-0.003
ws-0.003
pack-0.003
nob-0.003
TokenHello
Token ID15496
Feature activation+5.20e+00

Pos logit contributions
 oh+0.011
 dear+0.011
 sorry+0.010
 Please+0.010
 bye+0.010
Neg logit contributions
 ease-0.012
ws-0.011
zip-0.011
nob-0.010
ips-0.010
Token,
Token ID11
Feature activation+4.94e+00

Pos logit contributions
 oh+0.089
 gust+0.085
 Please+0.082
Run+0.081
 sorry+0.079
Neg logit contributions
 ease-0.121
 stairs-0.116
 blocks-0.115
ws-0.114
cast-0.114
Token Farmer
Token ID37755
Feature activation+7.85e+00

Pos logit contributions
 oh+0.066
 gust+0.061
 Please+0.061
Run+0.058
 bye+0.053
Neg logit contributions
 ease-0.141
 build-0.133
 stairs-0.132
 rebuild-0.129
ws-0.129
Token Joe
Token ID5689
Feature activation+4.77e+00

Pos logit contributions
 gust+0.136
 oh+0.134
Run+0.126
 dear+0.121
 bye+0.120
Neg logit contributions
sey-0.197
 ease-0.192
nob-0.189
ws-0.188
ge-0.188
Token,"
Token ID553
Feature activation+3.40e+00

Pos logit contributions
 oh+0.121
 dear+0.110
Run+0.108
 gust+0.108
 Please+0.105
Neg logit contributions
ws-0.094
 ease-0.094
nob-0.089
sey-0.087
ge-0.087
Token they
Token ID484
Feature activation+2.71e-01

Pos logit contributions
 dear+0.064
 oh+0.062
 bye+0.060
 gust+0.060
 Please+0.057
Neg logit contributions
 ease-0.078
 build-0.074
 stairs-0.072
 bank-0.070
 rebuild-0.069
Token say
Token ID910
Feature activation+2.51e+00

Pos logit contributions
 oh+0.006
 dear+0.005
Run+0.005
 Please+0.005
 Can+0.005
Neg logit contributions
 ease-0.007
ws-0.006
 bank-0.006
zip-0.006
 surfaces-0.006
Token.
Token ID13
Feature activation+3.24e+00

Pos logit contributions
 oh+0.073
 dear+0.071
Run+0.071
 gust+0.067
 goat+0.067
Neg logit contributions
 ease-0.036
 aside-0.035
nob-0.033
 neatly-0.033
 rebuild-0.033
Token said
Token ID531
Feature activation+1.65e+00

Pos logit contributions
 oh+0.016
 dear+0.015
 Please+0.015
Run+0.015
 bye+0.014
Neg logit contributions
 ease-0.015
ws-0.014
zip-0.014
nob-0.014
 stairs-0.013
Token,
Token ID11
Feature activation+2.83e+00

Pos logit contributions
 oh+0.029
Run+0.028
 dear+0.026
 Please+0.025
 Uncle+0.024
Neg logit contributions
 ease-0.044
ws-0.042
ics-0.042
 stairs-0.041
 tops-0.041
Token "
Token ID366
Feature activation+3.96e-01

Pos logit contributions
 oh+0.053
 dear+0.050
Run+0.050
heart+0.046
 gust+0.046
Neg logit contributions
 ease-0.071
 stairs-0.068
nob-0.067
 neatly-0.067
ws-0.066
TokenI
Token ID40
Feature activation-1.35e+00

Pos logit contributions
 oh+0.009
 dear+0.008
 Please+0.008
Run+0.008
 Uncle+0.007
Neg logit contributions
 ease-0.010
ws-0.010
ips-0.009
zip-0.009
ge-0.009
Token am
Token ID716
Feature activation+1.90e+00

Pos logit contributions
 ease+0.026
ws+0.025
zip+0.023
ips+0.023
nob+0.023
Neg logit contributions
 oh-0.037
 dear-0.035
 Please-0.034
Run-0.034
 Uncle-0.033
Token Doctor
Token ID9356
Feature activation+7.68e+00

Pos logit contributions
 oh+0.042
 dear+0.040
Run+0.039
 Please+0.037
 bye+0.036
Neg logit contributions
 ease-0.046
ws-0.043
zip-0.042
ips-0.041
nob-0.041
Token Lily
Token ID20037
Feature activation+3.35e+00

Pos logit contributions
Run+0.156
 gust+0.156
 oh+0.151
 gru+0.139
 dear+0.138
Neg logit contributions
 stairs-0.157
 blocks-0.154
 build-0.152
 ramp-0.151
 puzzles-0.149
Token.
Token ID13
Feature activation+2.18e+00

Pos logit contributions
 oh+0.089
Run+0.089
 gust+0.082
 dear+0.082
 hello+0.079
Neg logit contributions
 ease-0.058
ws-0.055
sey-0.055
nob-0.055
ips-0.054
Token I
Token ID314
Feature activation-2.42e+00

Pos logit contributions
 oh+0.041
 gust+0.035
Run+0.035
 dear+0.034
 gru+0.032
Neg logit contributions
 ease-0.057
zip-0.056
ws-0.055
nob-0.054
 stairs-0.054
Token will
Token ID481
Feature activation-3.12e+00

Pos logit contributions
 ease+0.049
ips+0.046
 surfaces+0.046
 stairs+0.045
ws+0.044
Neg logit contributions
 oh-0.061
 dear-0.060
 Please-0.058
Run-0.056
 sorry-0.055
Token listen
Token ID6004
Feature activation+1.48e+00

Pos logit contributions
ips+0.070
 ease+0.066
ws+0.065
 halves+0.063
ge+0.062
Neg logit contributions
 dear-0.077
 oh-0.074
 Please-0.072
 sorry-0.070
 bye-0.068
Token
Token ID198
Feature activation-6.46e-01

Pos logit contributions
 oh+0.046
 dear+0.043
 gust+0.041
Run+0.041
 pigeon+0.040
Neg logit contributions
 ease-0.030
ws-0.028
 stairs-0.028
zip-0.027
 driveway-0.027
Token
Token ID198
Feature activation+6.91e-01

Pos logit contributions
 ease+0.013
ws+0.012
zip+0.012
ips+0.012
nob+0.011
Neg logit contributions
 oh-0.017
 dear-0.016
 Please-0.016
Run-0.015
 Uncle-0.015
Token"
Token ID1
Feature activation+1.57e+00

Pos logit contributions
 dear+0.017
 oh+0.017
 gust+0.016
 bye+0.016
 hello+0.016
Neg logit contributions
zip-0.013
 ease-0.013
ws-0.013
pack-0.013
eds-0.012
TokenSorry
Token ID14385
Feature activation+5.18e+00

Pos logit contributions
 dear+0.038
 oh+0.035
 bye+0.034
 gust+0.034
 sorry+0.033
Neg logit contributions
zip-0.033
 ease-0.033
ws-0.030
pack-0.030
ips-0.030
Token,
Token ID11
Feature activation+3.94e+00

Pos logit contributions
Run+0.113
 oh+0.108
 gust+0.104
 photographer+0.102
 gru+0.101
Neg logit contributions
 blocks-0.096
 build-0.095
 stairs-0.092
 surfaces-0.091
 knots-0.087
Token Farmer
Token ID37755
Feature activation+7.67e+00

Pos logit contributions
bye+0.057
over+0.057
 Please+0.053
 oh+0.051
 pigeon+0.050
Neg logit contributions
 build-0.099
 stairs-0.097
 blocks-0.095
 ramp-0.094
 ease-0.094
Token Joe
Token ID5689
Feature activation+3.21e+00

Pos logit contributions
 gust+0.130
 oh+0.120
Run+0.115
 bye+0.114
 dear+0.112
Neg logit contributions
sey-0.190
 build-0.185
 stairs-0.184
 blocks-0.182
nob-0.181
Token,"
Token ID553
Feature activation+3.50e+00

Pos logit contributions
 oh+0.081
Run+0.072
 dear+0.071
 gust+0.070
 Please+0.069
Neg logit contributions
ws-0.064
 ease-0.063
nob-0.060
 bank-0.060
 rebuild-0.059
Token Sam
Token ID3409
Feature activation+5.36e-01

Pos logit contributions
 oh+0.056
 dear+0.054
 Please+0.051
 bye+0.050
over+0.049
Neg logit contributions
 ease-0.090
 build-0.089
 stairs-0.088
 blocks-0.086
 rebuild-0.084
Token says
Token ID1139
Feature activation+1.36e+00

Pos logit contributions
 oh+0.013
 dear+0.012
 Please+0.011
 gust+0.011
Run+0.011
Neg logit contributions
ws-0.012
 ease-0.011
nob-0.011
ips-0.010
 stairs-0.010
Token.
Token ID13
Feature activation+3.47e+00

Pos logit contributions
 oh+0.039
 dear+0.037
Run+0.035
 Please+0.035
 goat+0.034
Neg logit contributions
 ease-0.023
ws-0.022
ips-0.022
 aside-0.021
ge-0.021
Token
Token ID198
Feature activation-5.56e-01

Pos logit contributions
ws+0.047
 ease+0.047
 surfaces+0.043
ips+0.043
ge+0.042
Neg logit contributions
 oh-0.044
Run-0.042
 dear-0.041
 Please-0.041
 Uncle-0.040
Token"
Token ID1
Feature activation+7.79e-01

Pos logit contributions
 ease+0.012
zip+0.012
ws+0.012
nob+0.011
pack+0.011
Neg logit contributions
 oh-0.013
 dear-0.013
 sorry-0.012
 Please-0.011
 bye-0.011
TokenHello
Token ID15496
Feature activation+4.18e+00

Pos logit contributions
 oh+0.016
 dear+0.015
 Please+0.014
 bye+0.014
Run+0.013
Neg logit contributions
 ease-0.020
ws-0.019
zip-0.019
nob-0.018
ips-0.018
Token there
Token ID612
Feature activation+2.17e+00

Pos logit contributions
 oh+0.083
Run+0.075
 Please+0.074
 gust+0.074
 dear+0.074
Neg logit contributions
 ease-0.100
zip-0.097
ge-0.096
ws-0.094
 stairs-0.093
Token!"
Token ID2474
Feature activation+2.38e+00

Pos logit contributions
 oh+0.045
 dear+0.041
Run+0.041
 Please+0.039
 bye+0.039
Neg logit contributions
 ease-0.052
ics-0.049
 bank-0.048
 stairs-0.048
ws-0.048
Token Doctor
Token ID9356
Feature activation+7.56e+00

Pos logit contributions
 dear+0.049
 oh+0.049
 bye+0.047
 sorry+0.043
 Please+0.043
Neg logit contributions
 ease-0.050
zip-0.049
 stairs-0.048
 bank-0.048
 become-0.046
Token Rock
Token ID4631
Feature activation+4.37e+00

Pos logit contributions
 dear+0.141
 oh+0.138
 bye+0.138
 gust+0.134
Run+0.130
Neg logit contributions
ge-0.174
sey-0.169
nob-0.166
 bank-0.166
zip-0.165
Tokener
Token ID263
Feature activation+3.99e+00

Pos logit contributions
 oh+0.108
 dear+0.103
 Please+0.097
Run+0.097
 bye+0.095
Neg logit contributions
 ease-0.091
ws-0.087
zip-0.087
ge-0.085
nob-0.084
Token called
Token ID1444
Feature activation+3.51e+00

Pos logit contributions
 oh+0.084
 dear+0.082
 Please+0.077
Run+0.077
 bye+0.075
Neg logit contributions
 ease-0.095
zip-0.092
ge-0.091
nob-0.091
sey-0.089
Token.
Token ID13
Feature activation+1.96e+00

Pos logit contributions
 oh+0.082
 dear+0.079
Run+0.075
 Please+0.073
 bye+0.071
Neg logit contributions
 ease-0.079
ws-0.073
nob-0.073
zip-0.072
 stairs-0.071
Token
Token ID220
Feature activation-7.47e-01

Pos logit contributions
 oh+0.039
 dear+0.035
Run+0.034
 bye+0.032
 Please+0.032
Neg logit contributions
 ease-0.050
zip-0.048
 stairs-0.047
ws-0.046
nob-0.046
Token
Token ID198
Feature activation-1.39e+00

Pos logit contributions
 oh+0.032
 dear+0.030
Run+0.029
 gust+0.028
 Please+0.028
Neg logit contributions
 ease-0.023
ws-0.020
zip-0.020
 stairs-0.020
ips-0.019
Token
Token ID198
Feature activation+2.63e-01

Pos logit contributions
 ease+0.034
ws+0.033
 surfaces+0.032
ips+0.032
zip+0.031
Neg logit contributions
 oh-0.030
 dear-0.028
 Uncle-0.028
Run-0.027
 Please-0.027
Token"
Token ID1
Feature activation+7.49e-01

Pos logit contributions
 dear+0.006
 oh+0.006
 gust+0.006
 bye+0.006
 hello+0.006
Neg logit contributions
zip-0.006
 ease-0.006
ws-0.005
pack-0.005
ips-0.005
TokenSorry
Token ID14385
Feature activation+5.07e+00

Pos logit contributions
 oh+0.016
 dear+0.016
 Please+0.014
 Uncle+0.014
 bye+0.014
Neg logit contributions
 ease-0.018
ws-0.017
zip-0.017
ips-0.016
 surfaces-0.016
Token,
Token ID11
Feature activation+3.84e+00

Pos logit contributions
Run+0.112
 oh+0.108
 importantly+0.102
 gru+0.102
 gust+0.102
Neg logit contributions
 blocks-0.092
 build-0.091
 stairs-0.088
 surfaces-0.086
 knots-0.085
Token Farmer
Token ID37755
Feature activation+7.49e+00

Pos logit contributions
bye+0.063
over+0.063
 Please+0.059
 oh+0.055
 bye+0.054
Neg logit contributions
 build-0.088
 stairs-0.087
 blocks-0.085
 ramp-0.083
 ease-0.083
Token Joe
Token ID5689
Feature activation+3.35e+00

Pos logit contributions
 gust+0.124
 oh+0.116
Run+0.111
 bye+0.108
 dear+0.107
Neg logit contributions
sey-0.191
 stairs-0.183
 build-0.183
nob-0.183
ws-0.182
Token,"
Token ID553
Feature activation+3.50e+00

Pos logit contributions
 oh+0.082
Run+0.074
 dear+0.073
 gust+0.071
 Please+0.071
Neg logit contributions
ws-0.069
 ease-0.068
nob-0.064
 bank-0.064
ics-0.064
Token Ben
Token ID3932
Feature activation+5.63e-01

Pos logit contributions
 dear+0.056
 oh+0.055
 bye+0.053
 Please+0.052
 gust+0.050
Neg logit contributions
 ease-0.088
 build-0.087
 stairs-0.086
 blocks-0.084
 rebuild-0.082
Token says
Token ID1139
Feature activation+1.57e+00

Pos logit contributions
 oh+0.014
 dear+0.013
 gust+0.012
 Please+0.012
Run+0.012
Neg logit contributions
ws-0.012
 ease-0.011
nob-0.011
 bank-0.011
ips-0.011
Token.
Token ID13
Feature activation+3.38e+00

Pos logit contributions
 oh+0.044
 dear+0.042
Run+0.040
 Please+0.039
 goat+0.039
Neg logit contributions
 ease-0.028
ws-0.026
ge-0.026
ips-0.026
 rebuild-0.025
Token said
Token ID531
Feature activation+2.18e+00

Pos logit contributions
 oh+0.021
 dear+0.020
 Please+0.019
Run+0.019
 bye+0.018
Neg logit contributions
 ease-0.023
ws-0.022
zip-0.021
nob-0.020
 surfaces-0.020
Token,
Token ID11
Feature activation+2.52e+00

Pos logit contributions
Run+0.037
 oh+0.037
 dear+0.035
 goat+0.034
 Uncle+0.033
Neg logit contributions
 ease-0.059
ws-0.056
ics-0.055
nob-0.054
 stairs-0.054
Token "
Token ID366
Feature activation+1.02e+00

Pos logit contributions
 oh+0.044
 dear+0.041
Run+0.040
 gust+0.037
 Please+0.036
Neg logit contributions
 ease-0.068
ws-0.065
zip-0.064
 stairs-0.064
nob-0.064
TokenI
Token ID40
Feature activation-1.61e+00

Pos logit contributions
 oh+0.023
 dear+0.022
 Please+0.020
Run+0.020
 Uncle+0.020
Neg logit contributions
 ease-0.024
ws-0.024
zip-0.023
ips-0.022
 surfaces-0.022
Token am
Token ID716
Feature activation+1.97e+00

Pos logit contributions
 ease+0.028
ws+0.026
zip+0.025
ips+0.025
 surfaces+0.024
Neg logit contributions
 oh-0.047
 dear-0.045
 Please-0.043
Run-0.043
 Uncle-0.042
Token Nurse
Token ID42724
Feature activation+7.48e+00

Pos logit contributions
 oh+0.044
Run+0.041
 dear+0.041
 Please+0.038
 gust+0.038
Neg logit contributions
 ease-0.047
ws-0.044
zip-0.043
ips-0.042
 rebuild-0.041
Token Sam
Token ID3409
Feature activation+3.76e+00

Pos logit contributions
 gust+0.162
Run+0.162
 oh+0.159
 sorry+0.151
 gru+0.145
Neg logit contributions
nob-0.133
zag-0.132
 stairs-0.131
ers-0.130
sey-0.129
Token.
Token ID13
Feature activation+1.80e+00

Pos logit contributions
 oh+0.105
Run+0.103
 gust+0.099
 dear+0.097
 Please+0.094
Neg logit contributions
 ease-0.062
ws-0.062
nob-0.058
sey-0.057
zip-0.056
Token I
Token ID314
Feature activation-2.64e+00

Pos logit contributions
 oh+0.033
 gust+0.028
Run+0.028
 dear+0.028
 goat+0.026
Neg logit contributions
 ease-0.049
zip-0.048
ws-0.046
nob-0.046
 ramp-0.045
Token will
Token ID481
Feature activation-2.78e+00

Pos logit contributions
 ease+0.052
 surfaces+0.049
ips+0.048
 stairs+0.048
 driveway+0.047
Neg logit contributions
 dear-0.068
 oh-0.068
 Please-0.065
Run-0.063
 sorry-0.062
Token give
Token ID1577
Feature activation-4.88e-01

Pos logit contributions
ips+0.062
 ease+0.060
ws+0.059
 stairs+0.057
ge+0.057
Neg logit contributions
 dear-0.067
 oh-0.065
 Please-0.063
 sorry-0.061
 bye-0.061
Token"
Token ID1
Feature activation+3.72e-01

Pos logit contributions
 ease+0.016
ws+0.016
zip+0.015
ips+0.015
pack+0.015
Neg logit contributions
 oh-0.016
 dear-0.015
 Please-0.014
Run-0.014
 Uncle-0.013
TokenHello
Token ID15496
Feature activation+3.80e+00

Pos logit contributions
 oh+0.008
 dear+0.007
Run+0.007
 Please+0.007
 Uncle+0.006
Neg logit contributions
 ease-0.010
ws-0.010
ips-0.009
zip-0.009
 surfaces-0.009
Token,
Token ID11
Feature activation+3.49e+00

Pos logit contributions
 oh+0.071
Run+0.071
 gru+0.067
 Please+0.067
 gust+0.063
Neg logit contributions
zip-0.088
 stairs-0.088
 ease-0.086
 lands-0.086
ics-0.085
Token this
Token ID428
Feature activation+1.67e-03

Pos logit contributions
 oh+0.071
Run+0.067
 Please+0.064
 gru+0.063
 gust+0.062
Neg logit contributions
 stairs-0.079
 ease-0.078
zip-0.075
 build-0.075
 ramp-0.075
Token is
Token ID318
Feature activation+3.35e+00

Pos logit contributions
 oh+0.000
 dear+0.000
 Please+0.000
Run+0.000
 Uncle+0.000
Neg logit contributions
 ease-0.000
ws-0.000
zip-0.000
ips-0.000
 surfaces-0.000
Token Doctor
Token ID9356
Feature activation+7.45e+00

Pos logit contributions
Run+0.083
 gust+0.079
 dear+0.075
 gru+0.074
heart+0.073
Neg logit contributions
 ease-0.060
ws-0.058
zip-0.057
 become-0.057
 puzzles-0.056
Token Ben
Token ID3932
Feature activation+3.91e+00

Pos logit contributions
 gust+0.163
 oh+0.161
Run+0.159
 dear+0.151
 bye+0.146
Neg logit contributions
 stairs-0.144
ge-0.142
ws-0.141
ics-0.141
pack-0.139
Token.
Token ID13
Feature activation+3.44e+00

Pos logit contributions
 oh+0.095
 dear+0.090
Run+0.087
 gust+0.084
 Please+0.083
Neg logit contributions
 ease-0.082
ws-0.081
nob-0.077
zip-0.077
pack-0.077
Token Do
Token ID2141
Feature activation+1.28e+00

Pos logit contributions
 oh+0.064
 gust+0.059
 dear+0.055
Run+0.055
heart+0.051
Neg logit contributions
zip-0.088
 build-0.085
ws-0.085
 ease-0.085
 ramp-0.084
Token you
Token ID345
Feature activation-5.20e-01

Pos logit contributions
 oh+0.025
 dear+0.024
Run+0.024
 gust+0.023
 Uncle+0.023
Neg logit contributions
 ease-0.032
ws-0.030
zip-0.029
ics-0.029
 become-0.029
Token have
Token ID423
Feature activation+1.22e-01

Pos logit contributions
 ease+0.014
ws+0.014
zip+0.013
nob+0.013
pack+0.013
Neg logit contributions
 oh-0.011
 dear-0.010
Run-0.009
 Uncle-0.009
 Please-0.009
Token what
Token ID644
Feature activation+5.57e-01

Pos logit contributions
 oh+0.054
Run+0.050
 dear+0.050
 Please+0.049
 gust+0.046
Neg logit contributions
 ease-0.071
ws-0.067
zip-0.064
 stairs-0.064
nob-0.064
Token this
Token ID428
Feature activation-1.38e+00

Pos logit contributions
 oh+0.011
 dear+0.010
Run+0.010
 gust+0.010
 bye+0.010
Neg logit contributions
 ease-0.014
 stairs-0.013
ws-0.013
nob-0.013
zip-0.013
Token mineral
Token ID18352
Feature activation-1.59e+00

Pos logit contributions
 ease+0.029
ws+0.027
zip+0.026
ips+0.026
nob+0.025
Neg logit contributions
 oh-0.035
 dear-0.034
 Please-0.032
Run-0.032
 Uncle-0.031
Token is
Token ID318
Feature activation+1.58e+00

Pos logit contributions
 ease+0.032
ws+0.031
zip+0.029
ips+0.029
 surfaces+0.028
Neg logit contributions
 oh-0.041
 dear-0.040
 Please-0.038
Run-0.037
 Uncle-0.036
Token?"
Token ID1701
Feature activation+1.97e+00

Pos logit contributions
 oh+0.031
 dear+0.028
Run+0.028
 Please+0.027
 bye+0.026
Neg logit contributions
 ease-0.042
ws-0.041
nob-0.039
pack-0.039
zip-0.039
Token Doctor
Token ID9356
Feature activation+7.32e+00

Pos logit contributions
 dear+0.042
 oh+0.042
 bye+0.039
 Please+0.036
Run+0.036
Neg logit contributions
 ease-0.045
zip-0.042
 stairs-0.041
 bank-0.040
nob-0.040
Token Rock
Token ID4631
Feature activation+4.12e+00

Pos logit contributions
 dear+0.147
 bye+0.144
 oh+0.139
Run+0.134
 gust+0.134
Neg logit contributions
ge-0.167
sey-0.161
zip-0.155
 ease-0.155
nob-0.153
Tokener
Token ID263
Feature activation+3.40e+00

Pos logit contributions
 oh+0.105
 dear+0.101
 Please+0.095
Run+0.095
 bye+0.093
Neg logit contributions
 ease-0.083
ws-0.079
zip-0.078
ge-0.077
ips-0.076
Token asked
Token ID1965
Feature activation+2.26e+00

Pos logit contributions
 oh+0.072
 dear+0.072
 Please+0.067
Run+0.067
 bye+0.066
Neg logit contributions
 ease-0.081
ge-0.078
zip-0.077
nob-0.076
sey-0.074
Token.
Token ID13
Feature activation+1.20e+00

Pos logit contributions
 oh+0.057
 dear+0.056
Run+0.054
 Please+0.053
 bye+0.053
Neg logit contributions
 ease-0.045
ge-0.041
nob-0.041
zip-0.040
ws-0.039
Token
Token ID220
Feature activation-1.22e+00

Pos logit contributions
 oh+0.033
 dear+0.031
Run+0.029
 Please+0.029
 bye+0.029
Neg logit contributions
 ease-0.023
zip-0.021
ws-0.020
 stairs-0.020
nob-0.020
Token"
Token ID1
Feature activation+1.45e+00

Pos logit contributions
 ease+0.005
zip+0.005
ws+0.005
nob+0.005
pack+0.004
Neg logit contributions
 oh-0.005
 dear-0.005
 Please-0.005
Run-0.005
 Uncle-0.005
TokenAre
Token ID8491
Feature activation+1.74e+00

Pos logit contributions
 dear+0.032
 oh+0.032
 Please+0.029
 bye+0.029
 Uncle+0.028
Neg logit contributions
 ease-0.034
zip-0.032
ws-0.032
nob-0.031
 surfaces-0.031
Token you
Token ID345
Feature activation+1.17e+00

Pos logit contributions
 oh+0.035
 dear+0.034
Run+0.033
 gru+0.033
 Uncle+0.033
Neg logit contributions
 ease-0.042
ips-0.038
ws-0.038
zip-0.037
ics-0.037
Token okay
Token ID8788
Feature activation+4.95e+00

Pos logit contributions
 oh+0.013
 dear+0.012
 Please+0.011
Run+0.011
 Uncle+0.010
Neg logit contributions
 ease-0.041
ws-0.040
zip-0.039
ips-0.038
pack-0.038
Token,
Token ID11
Feature activation+4.36e+00

Pos logit contributions
Run+0.104
 oh+0.102
 gru+0.091
 dear+0.090
 Please+0.085
Neg logit contributions
zag-0.108
ge-0.107
ics-0.106
sey-0.105
 build-0.104
Token Princess
Token ID8449
Feature activation+7.26e+00

Pos logit contributions
 oh+0.105
Run+0.105
 gru+0.099
 Please+0.095
ule+0.091
Neg logit contributions
 ease-0.078
 stairs-0.078
 tops-0.077
 blocks-0.077
 puzzles-0.077
Token Lily
Token ID20037
Feature activation+4.36e+00

Pos logit contributions
Run+0.193
 oh+0.175
 gust+0.173
 gru+0.171
 inqu+0.165
Neg logit contributions
 parties-0.109
rap-0.108
 anywhere-0.106
 find-0.105
pack-0.105
Token?"
Token ID1701
Feature activation+2.24e+00

Pos logit contributions
 oh+0.112
 dear+0.108
Run+0.105
 gust+0.099
 bye+0.097
Neg logit contributions
ws-0.083
nob-0.082
pack-0.080
zip-0.078
 build-0.076
Token asked
Token ID1965
Feature activation+1.69e+00

Pos logit contributions
 dear+0.061
 oh+0.060
Run+0.058
heart+0.056
ww+0.055
Neg logit contributions
 ease-0.035
 become-0.033
 find-0.033
 bank-0.033
nob-0.032
Token the
Token ID262
Feature activation+1.22e+00

Pos logit contributions
 oh+0.046
 dear+0.046
Run+0.044
 Please+0.041
 bye+0.041
Neg logit contributions
 ease-0.032
zip-0.027
nob-0.027
ws-0.027
sey-0.026
Token servant
Token ID18807
Feature activation+2.01e+00

Pos logit contributions
 oh+0.027
 dear+0.025
Run+0.024
 Please+0.023
 Uncle+0.023
Neg logit contributions
 ease-0.030
 stairs-0.028
ws-0.028
zip-0.028
 surfaces-0.027
Token
Token ID95
Feature activation+2.23e+00

Pos logit contributions
Run+0.045
 oh+0.040
 goat+0.040
 gust+0.039
 dear+0.039
Neg logit contributions
 ease-0.054
 neatly-0.051
 regularly-0.049
 aside-0.049
 tops-0.049
Token
Token ID26391
Feature activation+4.76e+00

Pos logit contributions
 oh+0.048
 dear+0.045
 Please+0.042
Run+0.042
 Uncle+0.041
Neg logit contributions
 ease-0.056
ws-0.054
zip-0.052
ips-0.051
nob-0.051
Token
Token ID129
Feature activation+2.72e+00

Pos logit contributions
 oh+0.131
 dear+0.124
Run+0.120
 Please+0.116
 Uncle+0.115
Neg logit contributions
 ease-0.088
zip-0.082
ws-0.082
ips-0.080
nob-0.078
Token
Token ID241
Feature activation+2.04e+00

Pos logit contributions
 oh+0.065
 dear+0.061
 Please+0.058
Run+0.058
 Uncle+0.056
Neg logit contributions
 ease-0.061
ws-0.059
zip-0.057
ips-0.056
nob-0.055
TokenHello
Token ID15496
Feature activation+4.35e+00

Pos logit contributions
 oh+0.046
 dear+0.044
 Please+0.041
Run+0.041
 Uncle+0.040
Neg logit contributions
 ease-0.048
ws-0.046
zip-0.045
ips-0.044
 surfaces-0.043
Token Princess
Token ID8449
Feature activation+7.16e+00

Pos logit contributions
 oh+0.084
Run+0.079
 dear+0.074
 Please+0.072
 gust+0.072
Neg logit contributions
 ease-0.102
zip-0.101
 stairs-0.099
ws-0.099
ips-0.098
Token!
Token ID0
Feature activation+3.09e+00

Pos logit contributions
 oh+0.153
Run+0.151
 gust+0.146
 dear+0.136
 gru+0.133
Neg logit contributions
zip-0.136
zag-0.136
ips-0.135
 stairs-0.135
ws-0.134
Token I
Token ID314
Feature activation-5.05e-01

Pos logit contributions
 oh+0.058
 dear+0.055
 gust+0.050
Run+0.050
 goat+0.048
Neg logit contributions
 ease-0.082
zip-0.079
nob-0.076
ws-0.075
ips-0.075
Token am
Token ID716
Feature activation+2.60e+00

Pos logit contributions
 ease+0.010
ws+0.010
ips+0.009
 surfaces+0.009
zip+0.009
Neg logit contributions
 oh-0.013
 dear-0.012
 Please-0.012
Run-0.012
 bye-0.011
Token here
Token ID994
Feature activation+2.54e+00

Pos logit contributions
 oh+0.055
 dear+0.052
Run+0.052
 Please+0.048
 bye+0.048
Neg logit contributions
 ease-0.064
ws-0.060
zip-0.058
ips-0.058
nob-0.057
Token to
Token ID284
Feature activation-2.00e+00

Pos logit contributions
 oh+0.058
 dear+0.054
Run+0.053
 Please+0.050
 gust+0.050
Neg logit contributions
 ease-0.058
ws-0.055
zip-0.053
pack-0.053
ips-0.053
Token can
Token ID460
Feature activation+7.26e-01

Pos logit contributions
 ease+0.002
ws+0.002
ics+0.002
zip+0.002
nob+0.002
Neg logit contributions
 oh-0.003
 dear-0.003
Run-0.003
 gust-0.003
 Please-0.003
Token we
Token ID356
Feature activation-1.77e+00

Pos logit contributions
 oh+0.011
 dear+0.010
Run+0.010
 bye+0.009
 Uncle+0.009
Neg logit contributions
 ease-0.022
ws-0.020
zip-0.020
 stairs-0.020
 surfaces-0.020
Token do
Token ID466
Feature activation+1.17e+00

Pos logit contributions
 ease+0.034
ws+0.033
ips+0.032
zip+0.032
nob+0.031
Neg logit contributions
 oh-0.047
 dear-0.046
 Please-0.043
Run-0.042
 sorry-0.042
Token?"
Token ID1701
Feature activation+2.74e+00

Pos logit contributions
 oh+0.023
 dear+0.022
Run+0.021
 Please+0.020
 bye+0.020
Neg logit contributions
 ease-0.031
ws-0.030
zip-0.029
pack-0.028
ips-0.028
Token asked
Token ID1965
Feature activation+2.47e+00

Pos logit contributions
 dear+0.060
 oh+0.059
 bye+0.057
Run+0.054
 goat+0.054
Neg logit contributions
 ease-0.061
zip-0.054
 stairs-0.054
 build-0.053
 bank-0.053
Token Little
Token ID7703
Feature activation+7.12e+00

Pos logit contributions
 dear+0.046
 oh+0.044
Run+0.042
 bye+0.042
 Please+0.042
Neg logit contributions
 ease-0.067
zip-0.060
sey-0.060
 neatly-0.059
pack-0.058
Token Jack
Token ID3619
Feature activation+2.79e+00

Pos logit contributions
 dear+0.132
 oh+0.130
 bye+0.124
 gust+0.121
Run+0.118
Neg logit contributions
nob-0.164
sey-0.161
 stairs-0.161
 ease-0.156
 build-0.155
Token.
Token ID13
Feature activation+1.38e+00

Pos logit contributions
 oh+0.067
 dear+0.064
Run+0.061
 Please+0.060
 bye+0.058
Neg logit contributions
 ease-0.062
ws-0.059
zip-0.057
nob-0.056
ge-0.056
Token
Token ID220
Feature activation-5.67e-01

Pos logit contributions
 oh+0.041
 dear+0.040
Run+0.037
 bye+0.037
 gust+0.036
Neg logit contributions
 ease-0.022
zip-0.020
 stairs-0.019
nob-0.019
ws-0.019
Token
Token ID198
Feature activation-4.56e-01

Pos logit contributions
ws+0.014
 ease+0.013
zip+0.013
 surfaces+0.013
nob+0.013
Neg logit contributions
 oh-0.012
 dear-0.012
 Uncle-0.011
 Please-0.011
 sorry-0.011
Token
Token ID198
Feature activation+6.29e-01

Pos logit contributions
 ease+0.010
ws+0.010
zip+0.009
ips+0.009
 surfaces+0.009
Neg logit contributions
 oh-0.011
 dear-0.011
 Please-0.010
Run-0.010
 Uncle-0.010
Token you
Token ID345
Feature activation+2.46e+00

Pos logit contributions
 Uncle+0.024
 dear+0.024
 oh+0.023
 gust+0.023
 goat+0.023
Neg logit contributions
ips-0.021
 ease-0.020
zip-0.020
ws-0.019
pack-0.019
Token for
Token ID329
Feature activation-1.52e+00

Pos logit contributions
Run+0.055
 oh+0.053
 Please+0.046
 hello+0.046
 goat+0.046
Neg logit contributions
 ease-0.053
 neatly-0.050
 stairs-0.050
 bank-0.050
 halves-0.050
Token serving
Token ID7351
Feature activation-2.35e+00

Pos logit contributions
 ease+0.041
ws+0.039
zip+0.038
ips+0.038
 surfaces+0.037
Neg logit contributions
 oh-0.031
 dear-0.028
 Please-0.026
Run-0.026
 bye-0.026
Token me
Token ID502
Feature activation-1.21e-01

Pos logit contributions
 ease+0.058
ws+0.055
ips+0.055
zip+0.054
ge+0.054
Neg logit contributions
 oh-0.052
 dear-0.049
 Please-0.046
 Uncle-0.045
 bye-0.044
Token,
Token ID11
Feature activation+3.47e+00

Pos logit contributions
 ease+0.003
ws+0.003
zip+0.003
ips+0.003
 surfaces+0.003
Neg logit contributions
 oh-0.002
 dear-0.002
 Please-0.002
Run-0.002
 Uncle-0.002
Token Doctor
Token ID9356
Feature activation+7.11e+00

Pos logit contributions
 oh+0.070
Run+0.066
 dear+0.060
 Please+0.060
 Uncle+0.059
Neg logit contributions
 ease-0.086
ws-0.081
 stairs-0.080
 halves-0.078
zip-0.077
Token,"
Token ID553
Feature activation+1.73e+00

Pos logit contributions
 oh+0.164
Run+0.164
 gust+0.162
 dear+0.149
 gru+0.145
Neg logit contributions
ws-0.135
ge-0.130
 ease-0.127
 stairs-0.126
 bank-0.125
Token said
Token ID531
Feature activation+3.08e+00

Pos logit contributions
 dear+0.032
 oh+0.032
 bye+0.031
Run+0.030
 Please+0.030
Neg logit contributions
 ease-0.039
 bank-0.038
zip-0.037
 build-0.037
 halves-0.037
Token Lily
Token ID20037
Feature activation+1.60e+00

Pos logit contributions
 oh+0.071
 dear+0.070
Run+0.069
 Please+0.065
 bye+0.064
Neg logit contributions
 ease-0.066
ws-0.059
zip-0.059
 rebuild-0.059
 bank-0.058
Token.
Token ID13
Feature activation+9.34e-01

Pos logit contributions
 oh+0.036
 dear+0.034
 Please+0.032
Run+0.031
 Uncle+0.031
Neg logit contributions
 ease-0.038
ws-0.037
zip-0.035
ips-0.035
 surfaces-0.035
Token
Token ID198
Feature activation-3.10e+00

Pos logit contributions
 oh+0.019
 dear+0.018
 Please+0.017
Run+0.016
 Uncle+0.016
Neg logit contributions
 ease-0.025
ws-0.024
zip-0.023
ips-0.023
 surfaces-0.022
Token"
Token ID1
Feature activation+1.16e+00

Pos logit contributions
 ease+0.015
ws+0.015
zip+0.015
ips+0.014
pack+0.014
Neg logit contributions
 oh-0.014
 dear-0.013
Run-0.012
 Please-0.012
 Uncle-0.012
TokenHello
Token ID15496
Feature activation+3.84e+00

Pos logit contributions
 oh+0.028
Run+0.025
 Please+0.025
 dear+0.024
 bye+0.023
Neg logit contributions
 ease-0.027
ws-0.026
ge-0.025
nob-0.025
ips-0.024
Token,
Token ID11
Feature activation+3.14e+00

Pos logit contributions
 oh+0.068
Run+0.064
 dear+0.060
 gru+0.060
 Please+0.059
Neg logit contributions
 ease-0.100
ws-0.098
zip-0.098
ips-0.096
 stairs-0.096
Token I
Token ID314
Feature activation-7.08e-01

Pos logit contributions
 oh+0.060
Run+0.054
 dear+0.052
 Please+0.052
 Uncle+0.051
Neg logit contributions
 ease-0.082
ws-0.078
zip-0.078
 stairs-0.076
 ramp-0.076
Token am
Token ID716
Feature activation+3.48e+00

Pos logit contributions
 ease+0.015
 surfaces+0.014
ws+0.014
 stairs+0.014
ics+0.013
Neg logit contributions
 oh-0.018
 dear-0.017
 Please-0.016
Run-0.016
 bye-0.016
Token Princess
Token ID8449
Feature activation+7.09e+00

Pos logit contributions
Run+0.070
 oh+0.068
 dear+0.067
heart+0.065
 gust+0.062
Neg logit contributions
 ease-0.083
zip-0.077
ips-0.076
ws-0.076
 ramp-0.074
Token Purple
Token ID17265
Feature activation+2.10e+00

Pos logit contributions
 gust+0.162
Run+0.158
 oh+0.153
 dear+0.150
 bye+0.144
Neg logit contributions
ge-0.125
 prizes-0.121
zip-0.121
 ramp-0.119
 stairs-0.118
Token!"
Token ID2474
Feature activation+1.43e+00

Pos logit contributions
 oh+0.047
 dear+0.045
 Please+0.043
Run+0.041
 bye+0.041
Neg logit contributions
 ease-0.051
ws-0.048
zip-0.046
 surfaces-0.046
ips-0.045
Token Lily
Token ID20037
Feature activation+1.61e+00

Pos logit contributions
 oh+0.034
 dear+0.031
Run+0.030
 goat+0.029
 bye+0.028
Neg logit contributions
 ease-0.032
 stairs-0.029
zip-0.029
ws-0.029
 become-0.028
Token says
Token ID1139
Feature activation+2.14e+00

Pos logit contributions
 oh+0.033
 dear+0.031
Run+0.029
 Please+0.029
 Uncle+0.028
Neg logit contributions
 ease-0.042
ws-0.041
zip-0.039
ips-0.038
nob-0.038
Token as
Token ID355
Feature activation+3.37e+00

Pos logit contributions
 oh+0.060
Run+0.059
 dear+0.058
 goat+0.056
 Please+0.054
Neg logit contributions
 ease-0.035
 aside-0.034
 neatly-0.034
ge-0.034
nob-0.033
Token at
Token ID379
Feature activation+1.24e-01

Pos logit contributions
Run+0.068
 oh+0.067
 gru+0.059
 dear+0.058
 gust+0.058
Neg logit contributions
 ease-0.083
ws-0.080
pack-0.079
 neatly-0.079
zip-0.078
Token the
Token ID262
Feature activation+1.89e+00

Pos logit contributions
 oh+0.003
 dear+0.003
 Uncle+0.002
 Please+0.002
Run+0.002
Neg logit contributions
 ease-0.003
ws-0.003
ips-0.003
zip-0.003
ge-0.003
Token rain
Token ID6290
Feature activation+1.91e+00

Pos logit contributions
 oh+0.048
 dear+0.046
 Please+0.043
Run+0.043
 Uncle+0.042
Neg logit contributions
 ease-0.039
ws-0.038
zip-0.036
ips-0.035
 surfaces-0.035
Token!"
Token ID2474
Feature activation+2.32e+00

Pos logit contributions
 oh+0.043
 dear+0.041
 Please+0.039
Run+0.038
 Uncle+0.037
Neg logit contributions
 ease-0.046
ws-0.044
zip-0.042
ips-0.041
nob-0.040
Token shouted
Token ID17293
Feature activation+4.17e+00

Pos logit contributions
 oh+0.042
 dear+0.042
Run+0.042
 bye+0.041
 gust+0.041
Neg logit contributions
 ease-0.053
 bank-0.052
 stairs-0.051
 neatly-0.051
zip-0.051
Token Little
Token ID7703
Feature activation+7.01e+00

Pos logit contributions
Run+0.080
 oh+0.077
 dear+0.076
 importantly+0.074
 Please+0.069
Neg logit contributions
 ease-0.102
 build-0.097
 bank-0.096
nob-0.096
 rebuild-0.095
Token Matt
Token ID4705
Feature activation+3.33e+00

Pos logit contributions
 dear+0.125
 gust+0.122
 oh+0.121
 sorry+0.118
Run+0.117
Neg logit contributions
nob-0.172
sey-0.170
 ease-0.166
zip-0.165
pack-0.165
Token,
Token ID11
Feature activation+3.17e+00

Pos logit contributions
 oh+0.075
 dear+0.071
 Please+0.067
Run+0.066
 Uncle+0.065
Neg logit contributions
 ease-0.079
ws-0.077
zip-0.073
ips-0.073
 surfaces-0.072
Token excited
Token ID6568
Feature activation+3.70e+00

Pos logit contributions
 dear+0.075
 oh+0.074
 Please+0.069
 bye+0.068
 Uncle+0.067
Neg logit contributions
 ease-0.067
ge-0.060
 stairs-0.060
zip-0.060
 surfaces-0.060
Tokenly
Token ID306
Feature activation+3.86e+00

Pos logit contributions
Run+0.098
 oh+0.094
 dear+0.088
 sorry+0.086
 gust+0.085
Neg logit contributions
 surfaces-0.063
pack-0.062
 become-0.060
nob-0.059
 stairs-0.059
Token pointing
Token ID10609
Feature activation+7.87e-01

Pos logit contributions
 oh+0.090
 dear+0.087
Run+0.086
 Please+0.083
 bye+0.082
Neg logit contributions
 ease-0.079
 bank-0.077
ge-0.076
pack-0.074
nob-0.074
Token we
Token ID356
Feature activation+7.76e-01

Pos logit contributions
 oh+0.068
Run+0.065
 gru+0.063
 dear+0.062
 gust+0.061
Neg logit contributions
 ease-0.071
 stairs-0.067
ics-0.067
 halves-0.063
 tops-0.063
Token going
Token ID1016
Feature activation-7.66e-01

Pos logit contributions
 oh+0.018
 dear+0.017
Run+0.016
 Please+0.016
 Uncle+0.015
Neg logit contributions
 ease-0.018
ws-0.017
zip-0.017
ips-0.016
pack-0.016
Token today
Token ID1909
Feature activation+1.41e+00

Pos logit contributions
 ease+0.019
ws+0.018
 surfaces+0.017
zip+0.017
nob+0.017
Neg logit contributions
 oh-0.017
 dear-0.016
 Please-0.016
 Uncle-0.015
 bye-0.015
Token?"
Token ID1701
Feature activation+2.58e+00

Pos logit contributions
 oh+0.026
 dear+0.024
Run+0.023
 Please+0.022
 Uncle+0.022
Neg logit contributions
 ease-0.039
ws-0.038
zip-0.037
ips-0.036
pack-0.036
Token asked
Token ID1965
Feature activation+1.47e+00

Pos logit contributions
 dear+0.055
 reckless+0.053
 cruel+0.052
 oh+0.052
 gru+0.049
Neg logit contributions
 ease-0.052
 bank-0.051
 become-0.048
 build-0.048
 stairs-0.048
Token Little
Token ID7703
Feature activation+6.98e+00

Pos logit contributions
 oh+0.029
 dear+0.029
 Please+0.028
Run+0.027
 gust+0.027
Neg logit contributions
 ease-0.037
zip-0.033
sey-0.033
ws-0.032
pack-0.032
Token Sarah
Token ID10490
Feature activation+2.75e+00

Pos logit contributions
 dear+0.131
 sorry+0.130
 oh+0.129
 gust+0.129
 bye+0.123
Neg logit contributions
nob-0.165
 stairs-0.160
sey-0.156
zip-0.155
 ramp-0.154
Token.
Token ID13
Feature activation+4.43e-01

Pos logit contributions
 oh+0.060
 dear+0.057
 Please+0.054
 Uncle+0.053
Run+0.052
Neg logit contributions
 ease-0.067
ws-0.066
ips-0.063
zip-0.062
 surfaces-0.061
Token She
Token ID1375
Feature activation+1.71e+00

Pos logit contributions
 oh+0.009
 dear+0.009
Run+0.008
 Please+0.008
 sorry+0.008
Neg logit contributions
 ease-0.011
zip-0.011
 driveway-0.010
 stairs-0.010
ws-0.010
Token was
Token ID373
Feature activation+2.26e+00

Pos logit contributions
 oh+0.043
 dear+0.040
Run+0.039
 bye+0.037
 Please+0.037
Neg logit contributions
 ease-0.035
zip-0.033
pack-0.032
ws-0.032
 stairs-0.032
Token hopping
Token ID47153
Feature activation+6.33e-02

Pos logit contributions
 oh+0.046
 dear+0.044
Run+0.042
 Please+0.040
 bye+0.040
Neg logit contributions
 ease-0.057
ws-0.054
zip-0.053
pack-0.053
ips-0.053
Token"
Token ID1
Feature activation+1.42e+00

Pos logit contributions
 oh+0.009
 dear+0.009
 Please+0.008
Run+0.008
 Uncle+0.008
Neg logit contributions
 ease-0.009
ws-0.008
zip-0.008
ips-0.008
 surfaces-0.008
TokenHelp
Token ID22087
Feature activation+2.93e+00

Pos logit contributions
 oh+0.029
 dear+0.029
 bye+0.026
 Please+0.026
 Uncle+0.025
Neg logit contributions
 ease-0.036
zip-0.034
ws-0.034
nob-0.033
ips-0.033
Token!
Token ID0
Feature activation+2.85e+00

Pos logit contributions
 dear+0.068
 oh+0.065
 importantly+0.063
 Please+0.063
Run+0.062
Neg logit contributions
 ease-0.062
nob-0.058
ws-0.056
zip-0.055
 aside-0.055
Token Help
Token ID10478
Feature activation+7.52e-01

Pos logit contributions
 oh+0.059
 dear+0.056
 gust+0.053
Run+0.052
 bye+0.051
Neg logit contributions
 ease-0.067
zip-0.066
nob-0.065
pack-0.063
ws-0.063
Token!"
Token ID2474
Feature activation+2.18e+00

Pos logit contributions
 oh+0.016
 dear+0.015
 Please+0.014
Run+0.014
 Uncle+0.013
Neg logit contributions
 ease-0.020
ws-0.019
zip-0.018
ips-0.018
pack-0.018
Token Little
Token ID7703
Feature activation+6.96e+00

Pos logit contributions
 oh+0.044
 dear+0.042
 bye+0.039
Run+0.039
 goat+0.038
Neg logit contributions
 ease-0.052
 stairs-0.048
 bank-0.048
zip-0.047
nob-0.047
Token Annie
Token ID25860
Feature activation+2.51e+00

Pos logit contributions
 oh+0.124
 dear+0.120
 sorry+0.114
 importantly+0.113
 gust+0.112
Neg logit contributions
sey-0.173
nob-0.171
 ease-0.167
 stairs-0.167
zip-0.165
Token shouted
Token ID17293
Feature activation+3.04e+00

Pos logit contributions
 oh+0.056
 dear+0.055
 Please+0.050
 bye+0.049
 Can+0.049
Neg logit contributions
 ease-0.058
ws-0.054
zip-0.053
 bank-0.053
nob-0.053
Token from
Token ID422
Feature activation+2.13e+00

Pos logit contributions
 oh+0.084
 dear+0.084
Run+0.082
 Can+0.078
 goat+0.077
Neg logit contributions
 ease-0.050
 aside-0.048
nob-0.047
ge-0.046
 apart-0.045
Token the
Token ID262
Feature activation+5.82e-01

Pos logit contributions
 oh+0.054
 dear+0.053
 Please+0.050
Run+0.048
 bye+0.048
Neg logit contributions
 ease-0.045
ws-0.041
 stairs-0.039
zip-0.039
nob-0.039
Token top
Token ID1353
Feature activation-5.00e-01

Pos logit contributions
 oh+0.014
 dear+0.013
 Please+0.013
Run+0.013
 Uncle+0.012
Neg logit contributions
 ease-0.013
ws-0.013
zip-0.012
ips-0.012
nob-0.012
Token friend
Token ID1545
Feature activation+2.22e+00

Pos logit contributions
 ease+0.013
ws+0.013
zip+0.012
ips+0.012
pack+0.012
Neg logit contributions
 oh-0.010
 dear-0.009
Run-0.009
 Please-0.009
 gust-0.008
Token said
Token ID531
Feature activation+1.99e+00

Pos logit contributions
 oh+0.047
 dear+0.046
Run+0.045
 Please+0.044
 gust+0.042
Neg logit contributions
 ease-0.051
zip-0.047
 bank-0.046
 build-0.046
nob-0.046
Token,
Token ID11
Feature activation+3.89e+00

Pos logit contributions
Run+0.047
heart+0.044
 dear+0.044
 oh+0.042
 goat+0.042
Neg logit contributions
 ease-0.038
ics-0.037
 aside-0.036
 off-0.035
 build-0.035
Token "
Token ID366
Feature activation+2.62e-01

Pos logit contributions
heart+0.098
 dear+0.098
Run+0.091
 importantly+0.089
 oh+0.087
Neg logit contributions
 ease-0.062
 stairs-0.062
 build-0.061
 aside-0.058
nob-0.058
TokenHi
Token ID17250
Feature activation+3.55e+00

Pos logit contributions
 dear+0.006
 oh+0.006
 sorry+0.005
 gust+0.005
 Uncle+0.005
Neg logit contributions
 ease-0.006
zip-0.006
ws-0.006
nob-0.006
pack-0.006
Token Farmer
Token ID37755
Feature activation+6.96e+00

Pos logit contributions
 oh+0.070
 gru+0.068
Run+0.067
 Please+0.065
 gust+0.064
Neg logit contributions
ics-0.076
 ease-0.075
 stairs-0.073
zip-0.072
 anywhere-0.072
Token!
Token ID0
Feature activation+3.23e+00

Pos logit contributions
 gust+0.147
Run+0.143
 oh+0.141
 sorry+0.130
 Please+0.127
Neg logit contributions
sey-0.149
nob-0.145
 ease-0.141
 build-0.141
zag-0.140
Token I
Token ID314
Feature activation-1.18e+00

Pos logit contributions
 oh+0.062
 dear+0.059
 gust+0.059
Run+0.053
 goat+0.053
Neg logit contributions
zip-0.077
 ease-0.074
|-0.074
nob-0.072
 build-0.072
Token have
Token ID423
Feature activation+7.42e-01

Pos logit contributions
 ease+0.024
ws+0.022
zip+0.021
ips+0.021
 surfaces+0.021
Neg logit contributions
 oh-0.031
 dear-0.030
 Please-0.028
Run-0.028
 Uncle-0.027
Token some
Token ID617
Feature activation-1.44e+00

Pos logit contributions
 oh+0.016
 dear+0.015
 Please+0.014
Run+0.014
 Uncle+0.013
Neg logit contributions
 ease-0.019
ws-0.018
zip-0.018
ips-0.017
 surfaces-0.017
Token hay
Token ID27678
Feature activation-1.08e+00

Pos logit contributions
 ease+0.029
ws+0.028
zip+0.027
ips+0.026
 surfaces+0.026
Neg logit contributions
 oh-0.038
 dear-0.036
 Please-0.034
Run-0.034
 Uncle-0.033
Token.
Token ID13
Feature activation+3.14e-01

Pos logit contributions
 oh+0.036
 dear+0.035
Run+0.034
 Please+0.033
 bye+0.032
Neg logit contributions
 ease-0.031
ws-0.028
zip-0.028
nob-0.028
ips-0.027
Token "
Token ID366
Feature activation+1.04e+00

Pos logit contributions
 oh+0.007
 dear+0.007
Run+0.006
 Please+0.006
 bye+0.006
Neg logit contributions
 ease-0.007
zip-0.007
 stairs-0.007
ws-0.007
 driveway-0.007
TokenHello
Token ID15496
Feature activation+5.16e+00

Pos logit contributions
 dear+0.025
 oh+0.025
 sorry+0.023
 Uncle+0.022
 bye+0.022
Neg logit contributions
 ease-0.023
zip-0.022
ws-0.021
nob-0.021
pack-0.020
Token,
Token ID11
Feature activation+4.54e+00

Pos logit contributions
 oh+0.107
Run+0.104
 gru+0.101
 Please+0.099
 sorry+0.099
Neg logit contributions
 ease-0.103
 stairs-0.103
zip-0.102
 blocks-0.099
ics-0.099
Token my
Token ID616
Feature activation+3.54e+00

Pos logit contributions
 oh+0.088
Run+0.088
 gru+0.081
 Please+0.081
 goat+0.074
Neg logit contributions
 ease-0.105
 stairs-0.100
zip-0.097
 become-0.095
 tops-0.095
Token darling
Token ID40003
Feature activation+6.90e+00

Pos logit contributions
 oh+0.068
Run+0.068
 Please+0.062
 gru+0.060
 gust+0.057
Neg logit contributions
 ease-0.084
sey-0.082
 stairs-0.080
 tops-0.080
 blocks-0.079
Token!"
Token ID2474
Feature activation+2.66e+00

Pos logit contributions
 oh+0.175
Run+0.170
 gust+0.158
 dear+0.158
 bye+0.156
Neg logit contributions
 ease-0.122
ge-0.121
sey-0.121
nob-0.118
 stairs-0.118
Token he
Token ID339
Feature activation+2.84e+00

Pos logit contributions
 oh+0.058
 dear+0.056
Run+0.054
 bye+0.051
 gust+0.051
Neg logit contributions
 stairs-0.057
 ease-0.055
zip-0.053
 neatly-0.052
 surfaces-0.052
Token said
Token ID531
Feature activation+3.00e+00

Pos logit contributions
 oh+0.060
 dear+0.059
Run+0.054
 bye+0.053
 Can+0.052
Neg logit contributions
zip-0.062
 ease-0.061
nob-0.060
 become-0.060
sey-0.059
Token,
Token ID11
Feature activation+4.32e+00

Pos logit contributions
Run+0.085
 oh+0.082
 dear+0.081
 goat+0.080
ww+0.076
Neg logit contributions
nob-0.046
 ease-0.046
 neatly-0.044
 aside-0.043
 rebuild-0.042
Token smiling
Token ID16755
Feature activation+4.16e+00

Pos logit contributions
 oh+0.084
 dear+0.084
Run+0.081
 bye+0.074
heart+0.074
Neg logit contributions
 ease-0.098
nob-0.095
 stairs-0.094
 neatly-0.093
 bank-0.093
Token"
Token ID1
Feature activation+1.29e+00

Pos logit contributions
 ease+0.003
zip+0.003
ws+0.003
pack+0.003
ips+0.003
Neg logit contributions
 oh-0.003
 dear-0.003
Run-0.003
 gust-0.003
 Please-0.003
TokenHello
Token ID15496
Feature activation+4.59e+00

Pos logit contributions
 dear+0.028
 oh+0.028
 Uncle+0.025
 bye+0.025
 sorry+0.025
Neg logit contributions
 ease-0.031
zip-0.030
ws-0.029
 surfaces-0.028
 stairs-0.028
Token,
Token ID11
Feature activation+4.01e+00

Pos logit contributions
 oh+0.089
Run+0.080
 gust+0.076
 dear+0.076
 gru+0.074
Neg logit contributions
 ease-0.112
 stairs-0.108
ws-0.108
ge-0.106
zip-0.105
Token I
Token ID314
Feature activation-6.43e-01

Pos logit contributions
 oh+0.083
Run+0.076
 Please+0.072
 gust+0.071
 dear+0.069
Neg logit contributions
 ease-0.096
 stairs-0.089
ws-0.088
 ramp-0.087
zip-0.086
Token am
Token ID716
Feature activation+3.50e+00

Pos logit contributions
 ease+0.014
ws+0.013
 surfaces+0.013
ips+0.013
zip+0.013
Neg logit contributions
 oh-0.016
 dear-0.015
 Please-0.014
Run-0.014
 Uncle-0.014
Token Miss
Token ID4544
Feature activation+6.82e+00

Pos logit contributions
Run+0.077
 oh+0.075
 dear+0.072
 gust+0.071
 bye+0.071
Neg logit contributions
 ease-0.076
ws-0.071
nob-0.067
 stairs-0.067
ips-0.066
Token Cat
Token ID5181
Feature activation+5.34e+00

Pos logit contributions
 oh+0.170
 gust+0.165
Run+0.164
 dear+0.158
 sorry+0.150
Neg logit contributions
sey-0.127
ge-0.121
ips-0.119
 ease-0.116
zip-0.116
Token,
Token ID11
Feature activation+2.97e+00

Pos logit contributions
 oh+0.147
Run+0.138
 dear+0.136
 gust+0.135
 Please+0.132
Neg logit contributions
nob-0.089
sey-0.088
 ease-0.088
ws-0.087
ge-0.084
Token and
Token ID290
Feature activation+4.16e+00

Pos logit contributions
 oh+0.052
 dear+0.049
Run+0.047
 Please+0.045
 gust+0.045
Neg logit contributions
 ease-0.082
ws-0.079
zip-0.076
pack-0.076
nob-0.076
Token I
Token ID314
Feature activation-1.61e+00

Pos logit contributions
 gust+0.063
 oh+0.062
Run+0.060
 dear+0.057
 gru+0.054
Neg logit contributions
 ease-0.108
zip-0.106
 stairs-0.104
ws-0.104
 knots-0.101
Token am
Token ID716
Feature activation+2.58e+00

Pos logit contributions
 ease+0.040
 surfaces+0.038
ips+0.038
ws+0.038
zip+0.037
Neg logit contributions
 oh-0.034
 dear-0.033
 Please-0.031
Run-0.031
 sorry-0.029
Token Moon
Token ID6869
Feature activation+3.16e+00

Pos logit contributions
 oh+0.026
 dear+0.025
Run+0.024
 Please+0.024
 bye+0.023
Neg logit contributions
 ease-0.021
ws-0.020
zip-0.019
 surfaces-0.019
ips-0.019
Token who
Token ID508
Feature activation+4.17e-02

Pos logit contributions
 oh+0.077
Run+0.070
 dear+0.068
 Please+0.068
 Uncle+0.065
Neg logit contributions
 ease-0.068
ws-0.064
ics-0.063
nob-0.063
ge-0.063
Token said
Token ID531
Feature activation+2.13e+00

Pos logit contributions
 oh+0.001
 dear+0.001
 Please+0.001
Run+0.001
 Uncle+0.001
Neg logit contributions
 ease-0.001
ws-0.001
zip-0.001
ips-0.001
 surfaces-0.001
Token,
Token ID11
Feature activation+4.43e+00

Pos logit contributions
 oh+0.041
Run+0.039
 dear+0.037
 Please+0.036
 gust+0.035
Neg logit contributions
 ease-0.055
ws-0.050
nob-0.049
ics-0.049
 stairs-0.049
Token "
Token ID366
Feature activation-1.76e-01

Pos logit contributions
 oh+0.078
 dear+0.070
Run+0.068
 Please+0.063
 gust+0.063
Neg logit contributions
 ease-0.121
ws-0.112
 stairs-0.112
nob-0.111
zip-0.111
TokenLittle
Token ID22253
Feature activation+6.82e+00

Pos logit contributions
 ease+0.004
ws+0.004
 surfaces+0.004
ips+0.004
nob+0.004
Neg logit contributions
 oh-0.004
 Please-0.003
 dear-0.003
Run-0.003
 Uncle-0.003
Token one
Token ID530
Feature activation+2.13e+00

Pos logit contributions
 oh+0.153
 gust+0.144
Run+0.143
 sorry+0.142
 importantly+0.136
Neg logit contributions
sey-0.137
 ease-0.132
 stairs-0.130
nob-0.129
 destination-0.126
Token,
Token ID11
Feature activation+3.55e+00

Pos logit contributions
 oh+0.055
 dear+0.051
Run+0.050
 Please+0.050
 Uncle+0.049
Neg logit contributions
 ease-0.042
ws-0.041
zip-0.038
 stairs-0.038
nob-0.038
Token be
Token ID307
Feature activation+1.87e+00

Pos logit contributions
 oh+0.074
 dear+0.068
Run+0.067
 Please+0.065
 gust+0.064
Neg logit contributions
 ease-0.088
ws-0.084
zip-0.081
 stairs-0.080
nob-0.079
Token careful
Token ID8161
Feature activation+2.40e+00

Pos logit contributions
 oh+0.045
 dear+0.044
Run+0.042
 Please+0.041
 bye+0.040
Neg logit contributions
 ease-0.040
ws-0.039
zip-0.037
ips-0.037
nob-0.036
Token and
Token ID290
Feature activation+3.25e-01

Pos logit contributions
 oh+0.053
 dear+0.049
Run+0.048
 Please+0.047
 gust+0.045
Neg logit contributions
 ease-0.057
ws-0.055
zip-0.053
nob-0.053
 surfaces-0.053
Token as
Token ID355
Feature activation+2.84e+00

Pos logit contributions
ws+0.036
 ease+0.035
 surfaces+0.033
zip+0.033
angles+0.033
Neg logit contributions
 oh-0.019
 dear-0.018
 sorry-0.017
 Uncle-0.017
 Please-0.016
Token fast
Token ID3049
Feature activation+1.17e+00

Pos logit contributions
 oh+0.053
 dear+0.049
 Please+0.046
Run+0.045
 Uncle+0.045
Neg logit contributions
 ease-0.079
ws-0.077
zip-0.074
ips-0.073
 surfaces-0.073
Token as
Token ID355
Feature activation+2.96e+00

Pos logit contributions
 oh+0.021
 dear+0.020
Run+0.018
 Please+0.018
 Uncle+0.018
Neg logit contributions
 ease-0.033
ws-0.031
ips-0.030
zip-0.030
ge-0.030
Token he
Token ID339
Feature activation-2.96e+00

Pos logit contributions
 oh+0.054
 dear+0.052
 Please+0.050
Run+0.050
 gust+0.047
Neg logit contributions
 ease-0.084
ws-0.076
 stairs-0.075
zip-0.075
ge-0.075
Token could
Token ID714
Feature activation-4.41e+00

Pos logit contributions
 ease+0.076
ws+0.075
 surfaces+0.072
ips+0.071
ics+0.070
Neg logit contributions
 oh-0.061
 dear-0.057
 Please-0.055
Run-0.052
 bye-0.051
Token to
Token ID284
Feature activation-7.02e+00

Pos logit contributions
ws+0.133
 channels+0.129
 surfaces+0.125
ators+0.121
eds+0.121
Neg logit contributions
 oh-0.069
 dear-0.067
 sorry-0.063
 Please-0.062
 bye-0.062
Token fly
Token ID6129
Feature activation-2.75e+00

Pos logit contributions
 channels+0.155
oons+0.149
ips+0.149
ws+0.146
ushes+0.143
Neg logit contributions
 oh-0.160
 dear-0.156
 sorry-0.150
 bye-0.149
 named-0.149
Token the
Token ID262
Feature activation-2.71e+00

Pos logit contributions
ws+0.071
pack+0.068
ips+0.067
 ease+0.066
 driveway+0.065
Neg logit contributions
 oh-0.053
 Please-0.048
 dear-0.047
 hello-0.047
 gust-0.047
Token strip
Token ID10283
Feature activation-2.85e+00

Pos logit contributions
 ease+0.054
ws+0.051
zip+0.048
ips+0.047
nob+0.047
Neg logit contributions
 oh-0.073
 dear-0.071
 Please-0.067
Run-0.066
 Uncle-0.065
Token.
Token ID13
Feature activation-1.32e+00

Pos logit contributions
ws+0.081
 ease+0.077
zip+0.075
 surfaces+0.074
ips+0.073
Neg logit contributions
 oh-0.049
 dear-0.045
 Please-0.044
 gust-0.044
 bye-0.043
Token\n
Token ID198
Feature activation-2.86e+00

Pos logit contributions
 ease+0.030
ws+0.029
zip+0.028
ips+0.028
 surfaces+0.027
Neg logit contributions
 oh-0.031
 dear-0.029
 Please-0.028
Run-0.027
 Uncle-0.027
Token you
Token ID345
Feature activation-3.12e-01

Pos logit contributions
 oh+0.012
 dear+0.011
 Please+0.010
Run+0.010
 Uncle+0.010
Neg logit contributions
 ease-0.024
ws-0.023
zip-0.023
nob-0.022
pack-0.022
Token.
Token ID13
Feature activation+1.28e+00

Pos logit contributions
 ease+0.007
ws+0.007
zip+0.007
ips+0.007
nob+0.006
Neg logit contributions
 oh-0.007
 dear-0.007
Run-0.007
 Please-0.006
 Uncle-0.006
Token You
Token ID921
Feature activation-1.84e+00

Pos logit contributions
 oh+0.024
 gust+0.023
Run+0.022
 dear+0.021
 goat+0.021
Neg logit contributions
ws-0.034
 ease-0.034
ips-0.033
zip-0.033
nob-0.031
Token learned
Token ID4499
Feature activation+1.58e-01

Pos logit contributions
ge+0.039
 ease+0.039
ips+0.038
 surfaces+0.038
eds+0.038
Neg logit contributions
 dear-0.044
 oh-0.043
 Please-0.041
 sorry-0.040
 bye-0.039
Token how
Token ID703
Feature activation+5.80e-01

Pos logit contributions
 oh+0.003
 dear+0.003
 Please+0.003
 Uncle+0.003
Run+0.003
Neg logit contributions
 ease-0.004
ws-0.004
zip-0.004
ips-0.004
 surfaces-0.004
Token to
Token ID284
Feature activation-6.99e+00

Pos logit contributions
 oh+0.015
 dear+0.014
Run+0.014
 Please+0.013
 Uncle+0.013
Neg logit contributions
 ease-0.012
ws-0.011
zip-0.011
ips-0.011
pack-0.010
Token share
Token ID2648
Feature activation-2.48e+00

Pos logit contributions
ips+0.152
 ease+0.147
ushes+0.140
 stairs+0.139
packs+0.138
Neg logit contributions
 dear-0.157
 oh-0.157
 please-0.152
 Please-0.147
 named-0.146
Token and
Token ID290
Feature activation-2.15e+00

Pos logit contributions
 ease+0.078
 driveway+0.074
ips+0.073
ws+0.072
zip+0.072
Neg logit contributions
 oh-0.038
 dear-0.037
 Please-0.034
 hello-0.033
 bye-0.031
Token how
Token ID703
Feature activation+7.33e-01

Pos logit contributions
 ease+0.049
ips+0.048
ws+0.046
sey+0.045
nob+0.045
Neg logit contributions
 oh-0.053
 Please-0.050
 dear-0.048
 sorry-0.047
 bye-0.046
Token to
Token ID284
Feature activation-6.51e+00

Pos logit contributions
 oh+0.016
 dear+0.015
 Please+0.014
Run+0.014
 Uncle+0.014
Neg logit contributions
 ease-0.018
ws-0.017
zip-0.016
ips-0.016
 surfaces-0.016
Token be
Token ID307
Feature activation-1.48e+00

Pos logit contributions
ips+0.147
 stairs+0.136
ushes+0.132
 ease+0.132
packs+0.132
Neg logit contributions
 dear-0.148
 oh-0.144
 sorry-0.140
 please-0.139
 Please-0.133
Token and
Token ID290
Feature activation+2.85e-01

Pos logit contributions
 oh+0.012
 dear+0.012
 Please+0.011
 Uncle+0.011
Run+0.011
Neg logit contributions
 ease-0.016
ws-0.016
zip-0.016
ips-0.016
nob-0.015
Token soon
Token ID2582
Feature activation+1.45e+00

Pos logit contributions
 oh+0.006
 dear+0.006
 Please+0.005
 Uncle+0.005
 bye+0.005
Neg logit contributions
 ease-0.007
ws-0.007
ips-0.007
zip-0.007
nob-0.007
Token they
Token ID484
Feature activation-2.17e+00

Pos logit contributions
 oh+0.021
 dear+0.020
 bye+0.019
 Uncle+0.018
 Please+0.017
Neg logit contributions
ws-0.045
zip-0.044
 ease-0.044
pack-0.044
nob-0.043
Token were
Token ID547
Feature activation+1.77e-01

Pos logit contributions
 surfaces+0.057
ge+0.056
ips+0.056
 knots+0.055
eds+0.055
Neg logit contributions
 gust-0.032
 Harry-0.032
Run-0.031
 named-0.031
 bye-0.031
Token able
Token ID1498
Feature activation+3.67e-01

Pos logit contributions
 oh+0.004
 sorry+0.004
 dear+0.004
 gust+0.003
 Please+0.003
Neg logit contributions
ws-0.004
zip-0.004
 surfaces-0.004
 ease-0.004
nob-0.004
Token to
Token ID284
Feature activation-6.97e+00

Pos logit contributions
 oh+0.007
 dear+0.007
 Please+0.007
 Uncle+0.006
 sorry+0.006
Neg logit contributions
ws-0.010
 ease-0.009
zip-0.009
nob-0.009
 surfaces-0.009
Token stop
Token ID2245
Feature activation+1.13e+00

Pos logit contributions
ips+0.161
eds+0.158
 channels+0.144
ushes+0.142
ws+0.142
Neg logit contributions
 oh-0.155
 please-0.153
 dear-0.151
 sorry-0.148
 Please-0.141
Token him
Token ID683
Feature activation-1.82e+00

Pos logit contributions
 oh+0.025
 dear+0.023
 Please+0.022
 Uncle+0.022
Run+0.021
Neg logit contributions
ws-0.027
 ease-0.027
zip-0.026
nob-0.026
ips-0.025
Token from
Token ID422
Feature activation+8.85e-03

Pos logit contributions
ws+0.052
zip+0.051
 ease+0.051
ips+0.050
eds+0.049
Neg logit contributions
 oh-0.029
 dear-0.028
 Please-0.027
 sorry-0.026
 Uncle-0.026
Token scoring
Token ID9689
Feature activation-1.75e+00

Pos logit contributions
 oh+0.000
 Uncle+0.000
 dear+0.000
 Please+0.000
 sorry+0.000
Neg logit contributions
ws-0.000
zip-0.000
 ease-0.000
nob-0.000
ips-0.000
Token any
Token ID597
Feature activation-2.01e+00

Pos logit contributions
zip+0.040
pack+0.039
ws+0.039
 ease+0.039
nob+0.038
Neg logit contributions
 oh-0.038
 dear-0.037
 Please-0.035
 sorry-0.034
 gust-0.034
Token so
Token ID523
Feature activation+3.78e-02

Pos logit contributions
 ease+0.085
ws+0.082
zip+0.078
ips+0.077
 surfaces+0.076
Neg logit contributions
 oh-0.056
 dear-0.055
 Please-0.048
 sorry-0.046
 Uncle-0.045
Token she
Token ID673
Feature activation-1.31e+00

Pos logit contributions
 oh+0.001
Run+0.001
 dear+0.001
 Please+0.001
 Uncle+0.001
Neg logit contributions
 ease-0.001
ws-0.001
zip-0.001
 neatly-0.001
ips-0.001
Token would
Token ID561
Feature activation-5.28e+00

Pos logit contributions
 ease+0.033
ws+0.031
ips+0.030
 surfaces+0.030
ge+0.029
Neg logit contributions
 oh-0.029
 dear-0.027
 Please-0.026
Run-0.024
 Uncle-0.024
Token be
Token ID307
Feature activation-1.26e-02

Pos logit contributions
ips+0.120
eds+0.114
ws+0.109
 channels+0.107
 positions+0.104
Neg logit contributions
 dear-0.128
 please-0.121
 Please-0.120
 oh-0.113
 sorry-0.113
Token able
Token ID1498
Feature activation+6.12e-01

Pos logit contributions
 ease+0.000
ws+0.000
zip+0.000
ips+0.000
 surfaces+0.000
Neg logit contributions
 oh-0.000
 dear-0.000
 Please-0.000
Run-0.000
 Uncle-0.000
Token to
Token ID284
Feature activation-6.96e+00

Pos logit contributions
 oh+0.013
 dear+0.012
 Please+0.012
 sorry+0.011
 Uncle+0.011
Neg logit contributions
ws-0.016
 ease-0.016
zip-0.015
ips-0.015
 surfaces-0.015
Token open
Token ID1280
Feature activation-3.06e+00

Pos logit contributions
ips+0.179
eds+0.165
oons+0.163
 channels+0.163
ws+0.160
Neg logit contributions
 dear-0.147
 please-0.144
 oh-0.142
 named-0.129
 sorry-0.129
Token and
Token ID290
Feature activation-2.98e+00

Pos logit contributions
zag+0.099
ics+0.098
ws+0.097
seek+0.096
pack+0.094
Neg logit contributions
 oh-0.039
 dear-0.035
 Uncle-0.034
 Please-0.031
 hello-0.030
Token close
Token ID1969
Feature activation-8.68e-01

Pos logit contributions
ips+0.055
ws+0.053
 ease+0.052
eds+0.050
ics+0.050
Neg logit contributions
 oh-0.088
 dear-0.083
 Please-0.082
 Uncle-0.077
 sorry-0.076
Token the
Token ID262
Feature activation-3.91e+00

Pos logit contributions
ws+0.022
 ease+0.021
pack+0.021
zip+0.021
nob+0.020
Neg logit contributions
 oh-0.018
 dear-0.017
 Uncle-0.016
 Please-0.016
 hello-0.015
Token lock
Token ID5793
Feature activation-3.43e+00

Pos logit contributions
ics+0.090
 ease+0.088
ws+0.085
ips+0.082
zag+0.082
Neg logit contributions
 dear-0.097
 oh-0.091
 Please-0.091
 hello-0.087
 Can-0.084
Token together
Token ID1978
Feature activation-5.81e-01

Pos logit contributions
ws+0.005
 ease+0.005
ips+0.005
eds+0.005
 channels+0.005
Neg logit contributions
 oh-0.006
 dear-0.005
 Uncle-0.005
 Please-0.005
Run-0.005
Token he
Token ID339
Feature activation-1.97e+00

Pos logit contributions
 ease+0.019
ws+0.019
zip+0.018
ips+0.018
nob+0.018
Neg logit contributions
 oh-0.008
 dear-0.007
Run-0.006
 Uncle-0.006
 Please-0.006
Token and
Token ID290
Feature activation+5.64e-01

Pos logit contributions
ws+0.050
ips+0.050
 channels+0.050
 ease+0.049
 surfaces+0.049
Neg logit contributions
 oh-0.038
 dear-0.034
 gust-0.032
 goat-0.032
 named-0.031
Token Lisa
Token ID15378
Feature activation-2.15e+00

Pos logit contributions
 oh+0.012
 dear+0.011
 Please+0.010
 Uncle+0.010
Run+0.010
Neg logit contributions
 ease-0.015
ws-0.014
ips-0.014
zip-0.014
pack-0.014
Token managed
Token ID5257
Feature activation-3.24e+00

Pos logit contributions
 channels+0.055
ws+0.054
ips+0.054
 surfaces+0.054
eds+0.053
Neg logit contributions
 oh-0.037
 dear-0.035
 gust-0.035
 named-0.033
 curious-0.032
Token to
Token ID284
Feature activation-6.87e+00

Pos logit contributions
zip+0.099
ws+0.099
ips+0.098
 driveway+0.096
 channels+0.096
Neg logit contributions
 oh-0.045
 Uncle-0.042
 dear-0.041
 Sally-0.039
 hello-0.039
Token fix
Token ID4259
Feature activation-3.30e+00

Pos logit contributions
ips+0.178
eds+0.166
oons+0.160
ws+0.159
 channels+0.159
Neg logit contributions
 oh-0.144
 dear-0.136
 please-0.126
 Billy-0.124
 sorry-0.123
Token the
Token ID262
Feature activation-2.80e+00

Pos logit contributions
ws+0.091
pack+0.090
zip+0.089
 driveway+0.089
 ease+0.088
Neg logit contributions
 oh-0.063
 Uncle-0.056
 dear-0.056
 sorry-0.047
 Farmer-0.047
Token broken
Token ID5445
Feature activation-3.08e+00

Pos logit contributions
 ease+0.041
ws+0.038
zip+0.036
ips+0.035
nob+0.034
Neg logit contributions
 oh-0.090
 dear-0.086
 Please-0.083
Run-0.082
 gust-0.081
Token board
Token ID3096
Feature activation-3.82e+00

Pos logit contributions
 ease+0.045
ws+0.041
zip+0.039
ips+0.037
nob+0.037
Neg logit contributions
 oh-0.099
 dear-0.095
 Please-0.091
Run-0.090
 gust-0.090
Token.
Token ID13
Feature activation-1.74e+00

Pos logit contributions
ws+0.118
zip+0.114
 ease+0.114
ips+0.111
eds+0.110
Neg logit contributions
 oh-0.053
 dear-0.047
 Please-0.044
 sorry-0.044
 gust-0.043
Token managed
Token ID5257
Feature activation-2.62e+00

Pos logit contributions
ws+0.060
 ease+0.060
 surfaces+0.058
 channels+0.058
ips+0.058
Neg logit contributions
 oh-0.034
 Please-0.029
 dear-0.029
Run-0.029
 named-0.028
Token to
Token ID284
Feature activation-6.63e+00

Pos logit contributions
ws+0.077
zip+0.076
 ramp+0.076
pack+0.074
 driveway+0.073
Neg logit contributions
 oh-0.042
 Uncle-0.039
 hello-0.038
 dear-0.038
 bye-0.035
Token figure
Token ID3785
Feature activation-3.20e+00

Pos logit contributions
oons+0.154
ips+0.153
 channels+0.152
ws+0.149
eds+0.145
Neg logit contributions
 oh-0.150
 dear-0.142
 goat-0.136
 sorry-0.134
 hello-0.134
Token out
Token ID503
Feature activation-1.10e+00

Pos logit contributions
ws+0.071
 ease+0.068
 driveway+0.068
ips+0.067
pack+0.067
Neg logit contributions
 oh-0.077
 dear-0.071
 hello-0.069
 Uncle-0.069
 sorry-0.066
Token how
Token ID703
Feature activation+1.32e-01

Pos logit contributions
ws+0.030
pack+0.028
zip+0.028
 driveway+0.028
 ease+0.028
Neg logit contributions
 oh-0.022
 dear-0.021
 hello-0.019
 Uncle-0.018
 sorry-0.018
Token to
Token ID284
Feature activation-6.86e+00

Pos logit contributions
 oh+0.003
 dear+0.003
 Please+0.003
Run+0.003
 Uncle+0.003
Neg logit contributions
 ease-0.003
ws-0.003
ips-0.003
zip-0.003
nob-0.003
Token move
Token ID1445
Feature activation-1.71e+00

Pos logit contributions
 channels+0.151
ips+0.149
ws+0.148
eds+0.144
oons+0.144
Neg logit contributions
 oh-0.163
 dear-0.154
 please-0.142
 goat-0.141
 sorry-0.141
Token the
Token ID262
Feature activation-2.93e+00

Pos logit contributions
ws+0.042
 ease+0.041
pack+0.039
zip+0.039
ips+0.038
Neg logit contributions
 oh-0.037
 dear-0.034
 Please-0.032
 Uncle-0.032
 hello-0.032
Token wagon
Token ID31860
Feature activation-3.49e+00

Pos logit contributions
ws+0.071
ics+0.067
 ease+0.066
ips+0.066
eds+0.065
Neg logit contributions
 dear-0.066
 oh-0.063
 goat-0.062
 Please-0.061
 gust-0.059
Token by
Token ID416
Feature activation-1.61e+00

Pos logit contributions
ws+0.092
 ease+0.087
zip+0.084
eds+0.083
ips+0.083
Neg logit contributions
 oh-0.070
 Please-0.063
 dear-0.062
 gust-0.059
Run-0.058
Token herself
Token ID5223
Feature activation-1.87e+00

Pos logit contributions
ws+0.036
 ease+0.034
zip+0.034
ips+0.033
pack+0.033
Neg logit contributions
 oh-0.039
 dear-0.036
 Uncle-0.033
Run-0.033
 sorry-0.033
Token kids
Token ID3988
Feature activation+1.33e+00

Pos logit contributions
 oh+0.067
Run+0.060
 dear+0.059
 Please+0.058
 gru+0.055
Neg logit contributions
 ease-0.085
ws-0.082
zip-0.079
ips-0.078
 stairs-0.077
Token.
Token ID13
Feature activation+1.79e+00

Pos logit contributions
 oh+0.033
 dear+0.031
Run+0.030
 Please+0.029
 Uncle+0.028
Neg logit contributions
 ease-0.029
ws-0.028
zip-0.027
ips-0.026
pack-0.026
Token You
Token ID921
Feature activation-1.54e+00

Pos logit contributions
 oh+0.033
Run+0.030
 gust+0.030
 gru+0.029
 goat+0.029
Neg logit contributions
zip-0.047
ips-0.047
 ease-0.047
ws-0.046
nob-0.044
Token learned
Token ID4499
Feature activation+3.54e-01

Pos logit contributions
 ease+0.027
ge+0.027
ws+0.026
ips+0.026
 surfaces+0.026
Neg logit contributions
 oh-0.042
 dear-0.042
 Please-0.040
 sorry-0.039
 bye-0.039
Token how
Token ID703
Feature activation+5.97e-01

Pos logit contributions
 oh+0.008
 dear+0.007
 Please+0.007
Run+0.007
 Uncle+0.007
Neg logit contributions
 ease-0.009
ws-0.009
zip-0.008
ips-0.008
 surfaces-0.008
Token to
Token ID284
Feature activation-6.86e+00

Pos logit contributions
 oh+0.016
 dear+0.015
Run+0.015
 Uncle+0.014
 Please+0.014
Neg logit contributions
 ease-0.011
ws-0.011
zip-0.010
ips-0.010
pack-0.010
Token solve
Token ID8494
Feature activation-3.89e+00

Pos logit contributions
 ease+0.150
ips+0.148
 channels+0.141
 stairs+0.141
ics+0.137
Neg logit contributions
 dear-0.158
 oh-0.153
 Please-0.142
 named-0.141
 please-0.140
Token your
Token ID534
Feature activation-3.73e+00

Pos logit contributions
 driveway+0.090
 ease+0.085
pack+0.084
ics+0.082
nob+0.082
Neg logit contributions
 oh-0.090
 dear-0.086
 Uncle-0.077
 Please-0.075
 sorry-0.075
Token problem
Token ID1917
Feature activation-2.84e+00

Pos logit contributions
 ease+0.088
zip+0.081
ics+0.081
ws+0.080
ge+0.078
Neg logit contributions
 oh-0.086
 dear-0.085
 Please-0.080
 bye-0.078
 Uncle-0.077
Token and
Token ID290
Feature activation-1.87e+00

Pos logit contributions
 ease+0.090
zip+0.082
 driveway+0.082
ws+0.082
nob+0.080
Neg logit contributions
 oh-0.045
 dear-0.043
 Please-0.040
 Uncle-0.036
 Can-0.036
Token be
Token ID307
Feature activation-4.97e-01

Pos logit contributions
 ease+0.047
ws+0.043
ips+0.043
nob+0.043
sey+0.042
Neg logit contributions
 oh-0.044
 Please-0.040
 dear-0.038
 sorry-0.037
 bye-0.036
Token,
Token ID11
Feature activation+2.36e+00

Pos logit contributions
 oh+0.051
Run+0.047
 Please+0.045
 dear+0.044
 goat+0.042
Neg logit contributions
 ease-0.082
ws-0.080
 stairs-0.080
ips-0.078
ics-0.078
Token Sara
Token ID24799
Feature activation+3.48e+00

Pos logit contributions
 oh+0.050
Run+0.045
 Please+0.045
 gru+0.044
 dear+0.044
Neg logit contributions
 ease-0.056
ws-0.055
zip-0.052
 stairs-0.052
ips-0.052
Token!
Token ID0
Feature activation+2.51e+00

Pos logit contributions
 oh+0.092
Run+0.087
 gust+0.084
 Please+0.082
 hello+0.081
Neg logit contributions
ws-0.065
 ease-0.060
nob-0.060
eds-0.060
ics-0.059
Token You
Token ID921
Feature activation-1.27e+00

Pos logit contributions
 oh+0.047
 gust+0.043
 gru+0.043
Run+0.042
 goat+0.040
Neg logit contributions
ips-0.065
ws-0.064
 ease-0.063
zip-0.062
sey-0.059
Token managed
Token ID5257
Feature activation-1.71e+00

Pos logit contributions
ge+0.028
 ease+0.027
ips+0.027
 surfaces+0.026
eds+0.026
Neg logit contributions
 dear-0.030
 oh-0.030
 Please-0.028
 sorry-0.027
Run-0.027
Token to
Token ID284
Feature activation-6.83e+00

Pos logit contributions
 ease+0.051
zip+0.049
ips+0.049
 driveway+0.048
 ramp+0.048
Neg logit contributions
 dear-0.028
 oh-0.028
 Please-0.025
 Uncle-0.025
 bye-0.024
Token plant
Token ID4618
Feature activation-3.18e+00

Pos logit contributions
ips+0.156
 channels+0.153
 ease+0.139
ge+0.138
ws+0.138
Neg logit contributions
 dear-0.158
 oh-0.154
 please-0.140
 insect-0.139
 named-0.137
Token your
Token ID534
Feature activation-2.85e+00

Pos logit contributions
 driveway+0.086
 ease+0.086
ge+0.084
pack+0.082
zip+0.082
Neg logit contributions
 oh-0.062
 dear-0.058
 hello-0.050
 Uncle-0.050
 sorry-0.048
Token first
Token ID717
Feature activation-1.35e+00

Pos logit contributions
 ease+0.060
ws+0.059
zip+0.057
ge+0.057
ips+0.056
Neg logit contributions
 dear-0.074
 oh-0.072
 Please-0.064
 sorry-0.064
 hello-0.062
Token bulb
Token ID28287
Feature activation-1.75e+00

Pos logit contributions
 ease+0.029
ws+0.028
ips+0.027
zip+0.027
ics+0.026
Neg logit contributions
 dear-0.034
 oh-0.034
 Please-0.031
 bye-0.029
 gust-0.029
Token!"
Token ID2474
Feature activation+5.06e-01

Pos logit contributions
 ease+0.049
ws+0.046
zip+0.045
ips+0.045
pack+0.043
Neg logit contributions
 oh-0.033
 dear-0.033
 Please-0.029
 Uncle-0.027
 bye-0.027
Token.
Token ID13
Feature activation-1.09e+00

Pos logit contributions
 ease+0.057
ws+0.056
zip+0.055
ips+0.053
eds+0.052
Neg logit contributions
 oh-0.035
 dear-0.031
 Please-0.031
 sorry-0.030
 Uncle-0.030
Token They
Token ID1119
Feature activation-3.49e+00

Pos logit contributions
 ease+0.026
ws+0.025
zip+0.024
ips+0.024
 surfaces+0.024
Neg logit contributions
 oh-0.024
 dear-0.023
 Please-0.022
Run-0.022
 Uncle-0.021
Token clip
Token ID10651
Feature activation-5.06e+00

Pos logit contributions
eds+0.086
ips+0.080
keeping+0.073
orns+0.073
ics+0.073
Neg logit contributions
 sorry-0.068
 curious-0.068
Run-0.068
 Please-0.067
 hello-0.067
Token the
Token ID262
Feature activation-2.65e+00

Pos logit contributions
nob+0.153
 ease+0.149
 driveway+0.148
|+0.145
 achieve+0.145
Neg logit contributions
 oh-0.075
 dear-0.063
 Uncle-0.062
 bye-0.060
 hello-0.059
Token wires
Token ID19474
Feature activation-3.28e+00

Pos logit contributions
ws+0.060
ics+0.060
nob+0.057
ips+0.056
 ease+0.055
Neg logit contributions
 dear-0.064
 oh-0.063
 sorry-0.062
 Please-0.060
 Can-0.058
Token to
Token ID284
Feature activation-6.81e+00

Pos logit contributions
 ease+0.098
ws+0.097
zip+0.095
nob+0.094
 surfaces+0.093
Neg logit contributions
 oh-0.045
 dear-0.044
 Please-0.044
 sorry-0.043
 hello-0.041
Token their
Token ID511
Feature activation-3.59e+00

Pos logit contributions
ips+0.142
ics+0.134
eds+0.132
cast+0.129
ws+0.129
Neg logit contributions
 oh-0.174
 dear-0.155
 Uncle-0.154
 sorry-0.151
 please-0.150
Token toy
Token ID13373
Feature activation-3.60e+00

Pos logit contributions
 ease+0.078
ws+0.076
ics+0.076
ips+0.075
nob+0.075
Neg logit contributions
 oh-0.087
 dear-0.086
 Uncle-0.083
 Please-0.080
 uncle-0.080
Token car
Token ID1097
Feature activation-3.74e+00

Pos logit contributions
ws+0.085
 ease+0.085
aches+0.074
nob+0.074
zip+0.074
Neg logit contributions
 oh-0.083
 dear-0.076
 Please-0.075
 Uncle-0.072
 goat-0.071
Token and
Token ID290
Feature activation-2.62e+00

Pos logit contributions
 ease+0.121
ws+0.115
ips+0.110
eds+0.109
 surfaces+0.108
Neg logit contributions
 oh-0.050
 dear-0.047
 Please-0.045
 Uncle-0.045
 bye-0.041
Token the
Token ID262
Feature activation-2.25e+00

Pos logit contributions
ips+0.053
eds+0.053
 ease+0.052
ics+0.051
nob+0.049
Neg logit contributions
 oh-0.068
 dear-0.065
 Uncle-0.063
 Please-0.063
 bye-0.062
TokenWhy
Token ID5195
Feature activation-3.16e-02

Pos logit contributions
 oh+0.011
 dear+0.010
 Please+0.010
Run+0.009
 Uncle+0.009
Neg logit contributions
 ease-0.012
ws-0.012
zip-0.011
ips-0.011
nob-0.011
Token don
Token ID836
Feature activation-1.50e-01

Pos logit contributions
 ease+0.001
ws+0.001
ips+0.001
zip+0.001
ics+0.001
Neg logit contributions
 oh-0.001
Run-0.001
 dear-0.001
 Please-0.001
 gru-0.001
Token't
Token ID470
Feature activation+4.19e-01

Pos logit contributions
 ease+0.002
ws+0.002
zip+0.002
ips+0.002
 surfaces+0.002
Neg logit contributions
 oh-0.005
 dear-0.005
 Please-0.004
Run-0.004
 Uncle-0.004
Token you
Token ID345
Feature activation-3.83e+00

Pos logit contributions
 oh+0.008
 dear+0.007
Run+0.007
 Uncle+0.007
 Please+0.007
Neg logit contributions
 ease-0.011
ws-0.011
zip-0.011
 surfaces-0.011
 rebuild-0.011
Token try
Token ID1949
Feature activation-1.93e+00

Pos logit contributions
 ease+0.065
ws+0.065
ips+0.062
 stairs+0.061
nob+0.060
Neg logit contributions
 oh-0.108
 dear-0.107
 Please-0.103
Run-0.099
 sorry-0.097
Token to
Token ID284
Feature activation-6.80e+00

Pos logit contributions
nob+0.054
 ease+0.054
ws+0.054
 driveway+0.053
zip+0.053
Neg logit contributions
 oh-0.032
 dear-0.031
 sorry-0.028
 Uncle-0.028
 hello-0.028
Token make
Token ID787
Feature activation-3.22e+00

Pos logit contributions
 channels+0.151
ws+0.143
 stairs+0.143
ips+0.139
oons+0.139
Neg logit contributions
 dear-0.158
 please-0.144
 oh-0.143
 sorry-0.141
 pigeon-0.139
Token something
Token ID1223
Feature activation+1.67e-02

Pos logit contributions
 driveway+0.088
nob+0.088
zag+0.086
 ease+0.086
ws+0.085
Neg logit contributions
 oh-0.061
 dear-0.059
 sorry-0.052
 gust-0.050
 Please-0.050
Token else
Token ID2073
Feature activation-5.05e-01

Pos logit contributions
 oh+0.000
 dear+0.000
 Please+0.000
Run+0.000
 Uncle+0.000
Neg logit contributions
 ease-0.000
ws-0.000
zip-0.000
ips-0.000
nob-0.000
Token with
Token ID351
Feature activation-1.98e+00

Pos logit contributions
 ease+0.015
ws+0.014
zip+0.014
nob+0.013
ips+0.013
Neg logit contributions
 oh-0.009
 dear-0.009
 Please-0.008
Run-0.007
 sorry-0.007
Token the
Token ID262
Feature activation-3.93e+00

Pos logit contributions
 ease+0.047
nob+0.046
ws+0.046
zip+0.045
ips+0.045
Neg logit contributions
 oh-0.043
 dear-0.042
 Please-0.038
 Uncle-0.037
 sorry-0.036
Token There
Token ID1318
Feature activation-2.73e+00

Pos logit contributions
 ease+0.026
ws+0.025
zip+0.024
 stairs+0.024
 surfaces+0.024
Neg logit contributions
 oh-0.026
 dear-0.024
Run-0.023
 Please-0.022
 bye-0.022
Token were
Token ID547
Feature activation-3.34e-01

Pos logit contributions
 ease+0.067
ws+0.062
ips+0.061
zip+0.061
ics+0.060
Neg logit contributions
 oh-0.060
 dear-0.057
 Please-0.055
Run-0.054
 bye-0.051
Token so
Token ID523
Feature activation+9.53e-01

Pos logit contributions
ws+0.006
 ease+0.006
ge+0.006
zip+0.006
nob+0.006
Neg logit contributions
 oh-0.010
 dear-0.009
 Please-0.008
 Uncle-0.008
 hello-0.008
Token many
Token ID867
Feature activation-1.42e+00

Pos logit contributions
 oh+0.027
 dear+0.026
 Please+0.024
 sorry+0.024
Run+0.024
Neg logit contributions
 ease-0.018
ws-0.018
nob-0.016
zip-0.016
ge-0.016
Token sticks
Token ID16461
Feature activation-2.13e+00

Pos logit contributions
 ease+0.023
ws+0.022
zip+0.021
ips+0.020
nob+0.020
Neg logit contributions
 oh-0.043
 dear-0.041
 Please-0.039
Run-0.039
 Uncle-0.038
Token to
Token ID284
Feature activation-6.75e+00

Pos logit contributions
 ease+0.063
zip+0.062
ws+0.062
 surfaces+0.060
pack+0.059
Neg logit contributions
 oh-0.033
 dear-0.032
Run-0.028
 hello-0.028
 Please-0.028
Token choose
Token ID3853
Feature activation-1.91e+00

Pos logit contributions
ips+0.147
 channels+0.145
aries+0.131
ws+0.130
eds+0.130
Neg logit contributions
 oh-0.163
 named-0.149
 goat-0.147
 dear-0.146
Run-0.143
Token from
Token ID422
Feature activation-7.05e-01

Pos logit contributions
 ease+0.054
ws+0.052
 driveway+0.051
zip+0.051
ips+0.050
Neg logit contributions
 oh-0.034
 dear-0.033
 hello-0.029
 Uncle-0.028
 Please-0.027
Token!
Token ID0
Feature activation-1.38e+00

Pos logit contributions
ws+0.018
 ease+0.018
zip+0.018
ips+0.017
 driveway+0.017
Neg logit contributions
 oh-0.014
 dear-0.013
 Uncle-0.012
 goat-0.012
 Please-0.011
Token The
Token ID383
Feature activation-5.48e-01

Pos logit contributions
 ease+0.035
ws+0.033
zip+0.032
ips+0.032
 stairs+0.032
Neg logit contributions
 oh-0.029
 dear-0.027
Run-0.026
 Please-0.025
 bye-0.025
Token girl
Token ID2576
Feature activation-8.50e-02

Pos logit contributions
 ease+0.019
ws+0.018
zip+0.018
ips+0.017
nob+0.017
Neg logit contributions
 oh-0.007
 dear-0.006
 Please-0.006
Run-0.006
 Uncle-0.005
Token different
Token ID1180
Feature activation+7.79e-02

Pos logit contributions
 oh+0.004
 dear+0.004
 Please+0.004
Run+0.004
 Uncle+0.003
Neg logit contributions
 ease-0.005
ws-0.005
zip-0.004
ips-0.004
 surfaces-0.004
Token.
Token ID13
Feature activation+1.43e+00

Pos logit contributions
 oh+0.002
 dear+0.002
 Please+0.002
Run+0.002
 Uncle+0.002
Neg logit contributions
 ease-0.002
ws-0.002
zip-0.002
nob-0.002
ips-0.002
Token You
Token ID921
Feature activation-2.30e+00

Pos logit contributions
 oh+0.027
 gust+0.026
Run+0.024
 dear+0.024
 gru+0.023
Neg logit contributions
ips-0.037
ws-0.037
zip-0.037
 ease-0.036
 chooses-0.035
Token learned
Token ID4499
Feature activation+1.03e-02

Pos logit contributions
 surfaces+0.055
 ease+0.055
eds+0.053
ge+0.053
ips+0.052
Neg logit contributions
 dear-0.050
 Please-0.049
 oh-0.047
 sorry-0.046
 hello-0.043
Token how
Token ID703
Feature activation+2.89e-01

Pos logit contributions
 oh+0.000
 dear+0.000
 Please+0.000
 Uncle+0.000
 bye+0.000
Neg logit contributions
 ease-0.000
ws-0.000
ips-0.000
zip-0.000
 surfaces-0.000
Token to
Token ID284
Feature activation-6.75e+00

Pos logit contributions
 oh+0.007
 dear+0.007
 Please+0.007
Run+0.006
 Uncle+0.006
Neg logit contributions
 ease-0.006
ws-0.006
zip-0.006
ips-0.006
nob-0.006
Token zip
Token ID19974
Feature activation-4.15e+00

Pos logit contributions
ips+0.146
eds+0.145
 channels+0.142
ushes+0.138
 ease+0.138
Neg logit contributions
 dear-0.146
 oh-0.144
 please-0.141
 Please-0.139
 sorry-0.133
Token the
Token ID262
Feature activation-3.13e+00

Pos logit contributions
 driveway+0.114
 ease+0.112
nob+0.111
pack+0.109
ics+0.108
Neg logit contributions
 oh-0.070
 dear-0.070
 Please-0.066
 hello-0.066
 Can-0.064
Token ball
Token ID2613
Feature activation-4.83e+00

Pos logit contributions
ws+0.066
 ease+0.066
ge+0.064
nob+0.062
ics+0.062
Neg logit contributions
 oh-0.077
 Please-0.074
 dear-0.074
 Can-0.070
 bye-0.069
Token.
Token ID13
Feature activation+9.34e-01

Pos logit contributions
 ease+0.138
 driveway+0.130
 construct+0.128
 surfaces+0.127
irs+0.126
Neg logit contributions
 oh-0.077
 Please-0.075
 dear-0.069
 Can-0.067
 bye-0.067
Token You
Token ID921
Feature activation-3.48e+00

Pos logit contributions
 oh+0.017
 gust+0.015
 dear+0.015
Run+0.015
 goat+0.014
Neg logit contributions
 ease-0.026
ws-0.026
zip-0.025
ips-0.025
nob-0.024
Token children
Token ID1751
Feature activation+1.73e+00

Pos logit contributions
 oh+0.043
 gru+0.040
Run+0.040
 dear+0.039
 gust+0.038
Neg logit contributions
ips-0.039
 ease-0.036
ws-0.035
 stairs-0.034
 blocks-0.034
Token.
Token ID13
Feature activation+2.29e+00

Pos logit contributions
 oh+0.047
Run+0.045
 dear+0.044
 gust+0.043
 Please+0.041
Neg logit contributions
 ease-0.031
ws-0.030
ips-0.029
zip-0.028
pack-0.028
Token You
Token ID921
Feature activation-1.74e+00

Pos logit contributions
 oh+0.045
 gust+0.044
Run+0.040
 dear+0.040
 gru+0.038
Neg logit contributions
ips-0.058
 ease-0.057
ws-0.057
zip-0.056
 chooses-0.054
Token learned
Token ID4499
Feature activation+3.48e-01

Pos logit contributions
ge+0.038
 ease+0.038
 surfaces+0.038
ips+0.037
eds+0.037
Neg logit contributions
 dear-0.040
 oh-0.040
 Please-0.038
 sorry-0.038
Run-0.037
Token how
Token ID703
Feature activation+5.07e-01

Pos logit contributions
 oh+0.007
 dear+0.007
Run+0.007
 Please+0.007
 gust+0.006
Neg logit contributions
 ease-0.009
ws-0.008
zip-0.008
ips-0.008
nob-0.008
Token to
Token ID284
Feature activation-6.74e+00

Pos logit contributions
 oh+0.013
 dear+0.013
Run+0.012
 Please+0.012
 Uncle+0.012
Neg logit contributions
 ease-0.010
ws-0.010
zip-0.009
ips-0.009
pack-0.009
Token share
Token ID2648
Feature activation-2.37e+00

Pos logit contributions
ips+0.172
 ease+0.161
oons+0.160
ushes+0.160
cast+0.158
Neg logit contributions
 dear-0.132
 oh-0.128
 hello-0.122
 please-0.119
 bye-0.118
Token and
Token ID290
Feature activation-2.03e+00

Pos logit contributions
 ease+0.071
 driveway+0.067
ips+0.067
ws+0.067
nob+0.066
Neg logit contributions
 oh-0.041
 dear-0.039
 Please-0.036
 hello-0.035
 Uncle-0.034
Token be
Token ID307
Feature activation-7.25e-01

Pos logit contributions
 ease+0.047
ips+0.046
ws+0.045
nob+0.043
sey+0.043
Neg logit contributions
 oh-0.049
 Please-0.046
 dear-0.045
 bye-0.044
 hello-0.044
Token kind
Token ID1611
Feature activation+5.63e-01

Pos logit contributions
 ease+0.025
ws+0.024
ips+0.024
ge+0.024
 surfaces+0.024
Neg logit contributions
 oh-0.009
 dear-0.008
 Please-0.007
 sorry-0.007
 bye-0.007
Token.
Token ID13
Feature activation+1.68e+00

Pos logit contributions
 oh+0.014
 dear+0.013
Run+0.012
 Please+0.012
 Uncle+0.012
Neg logit contributions
 ease-0.012
ws-0.012
zip-0.011
ips-0.011
 surfaces-0.011
Token tell
Token ID1560
Feature activation-9.06e-02

Pos logit contributions
 oh+0.006
 dear+0.006
 Please+0.005
 Uncle+0.005
 bye+0.005
Neg logit contributions
ws-0.006
ips-0.006
 ease-0.006
zip-0.006
nob-0.006
Token them
Token ID606
Feature activation+1.26e-01

Pos logit contributions
 ease+0.002
ws+0.002
zip+0.001
ips+0.001
pack+0.001
Neg logit contributions
 oh-0.003
 dear-0.003
 Uncle-0.002
 Please-0.002
 hello-0.002
Token how
Token ID703
Feature activation+1.45e+00

Pos logit contributions
 oh+0.003
 dear+0.003
 hello+0.002
 Uncle+0.002
 bye+0.002
Neg logit contributions
ws-0.003
 ease-0.003
ips-0.003
zip-0.003
 driveway-0.003
Token he
Token ID339
Feature activation-2.32e+00

Pos logit contributions
 oh+0.030
 dear+0.028
 Please+0.026
Run+0.026
 Uncle+0.025
Neg logit contributions
 ease-0.037
ws-0.036
zip-0.035
ips-0.035
nob-0.034
Token managed
Token ID5257
Feature activation-2.84e+00

Pos logit contributions
 ease+0.064
ws+0.063
ips+0.063
 driveway+0.062
ics+0.061
Neg logit contributions
 oh-0.042
 dear-0.040
 Please-0.036
Run-0.034
 gust-0.032
Token to
Token ID284
Feature activation-6.73e+00

Pos logit contributions
ips+0.080
zip+0.079
ws+0.078
 driveway+0.078
pack+0.078
Neg logit contributions
 oh-0.047
 hello-0.044
 dear-0.043
 Uncle-0.042
 Please-0.039
Token make
Token ID787
Feature activation-2.78e+00

Pos logit contributions
ws+0.163
ips+0.160
eds+0.159
 channels+0.157
rones+0.156
Neg logit contributions
 dear-0.141
 oh-0.137
 hello-0.125
 please-0.124
 gust-0.118
Token his
Token ID465
Feature activation-1.87e+00

Pos logit contributions
ws+0.074
 driveway+0.071
zip+0.071
nob+0.071
 ease+0.070
Neg logit contributions
 oh-0.056
 dear-0.052
 Uncle-0.049
 hello-0.046
 sorry-0.045
Token hair
Token ID4190
Feature activation-2.99e+00

Pos logit contributions
ws+0.039
 ease+0.038
zip+0.036
ge+0.036
ips+0.036
Neg logit contributions
 dear-0.049
 oh-0.048
 Please-0.044
 bye-0.044
 sorry-0.044
Token so
Token ID523
Feature activation+3.20e-01

Pos logit contributions
ws+0.072
 ease+0.068
ips+0.067
irs+0.065
eds+0.064
Neg logit contributions
 oh-0.067
 dear-0.067
 sorry-0.059
 gust-0.056
 Please-0.056
Token perfect
Token ID2818
Feature activation+1.85e-02

Pos logit contributions
 dear+0.006
 oh+0.006
 sorry+0.006
 bye+0.005
 Please+0.005
Neg logit contributions
ws-0.009
 ease-0.008
nob-0.008
 driveway-0.008
zip-0.008
Token.
Token ID13
Feature activation-1.34e+00

Pos logit contributions
 oh+0.042
 dear+0.039
Run+0.037
 Please+0.037
 Uncle+0.036
Neg logit contributions
 ease-0.042
ws-0.040
zip-0.039
ips-0.039
 surfaces-0.038
Token He
Token ID679
Feature activation-1.37e+00

Pos logit contributions
 ease+0.038
ws+0.036
ge+0.036
zip+0.036
pack+0.035
Neg logit contributions
 dear-0.023
 oh-0.023
 Please-0.022
 Uncle-0.021
 bye-0.021
Token taught
Token ID7817
Feature activation-1.47e+00

Pos logit contributions
 ease+0.036
 surfaces+0.035
ws+0.035
ips+0.035
ge+0.034
Neg logit contributions
 oh-0.026
 dear-0.025
 Uncle-0.023
 goat-0.023
 bye-0.023
Token them
Token ID606
Feature activation-5.61e-01

Pos logit contributions
ws+0.029
 driveway+0.029
ips+0.029
nob+0.028
pack+0.028
Neg logit contributions
 oh-0.037
 dear-0.036
 Uncle-0.035
 hello-0.035
 bye-0.033
Token how
Token ID703
Feature activation+4.49e-01

Pos logit contributions
 ease+0.014
ws+0.014
zip+0.014
ips+0.013
 driveway+0.013
Neg logit contributions
 oh-0.012
 dear-0.011
 hello-0.010
 Uncle-0.010
 bye-0.010
Token to
Token ID284
Feature activation-6.73e+00

Pos logit contributions
 oh+0.012
 dear+0.011
Run+0.011
 Please+0.010
 Uncle+0.010
Neg logit contributions
 ease-0.009
ws-0.009
zip-0.008
ips-0.008
 surfaces-0.008
Token cook
Token ID4255
Feature activation-2.75e+00

Pos logit contributions
ips+0.139
 ease+0.135
ws+0.131
 stairs+0.130
eds+0.130
Neg logit contributions
 dear-0.165
 oh-0.159
 Please-0.152
 please-0.150
 sorry-0.144
Token,
Token ID11
Feature activation+2.40e-01

Pos logit contributions
 ease+0.069
ws+0.067
 driveway+0.066
zip+0.066
ge+0.065
Neg logit contributions
 oh-0.058
 dear-0.054
 Uncle-0.050
 Please-0.049
 sorry-0.049
Token how
Token ID703
Feature activation+8.21e-01

Pos logit contributions
 oh+0.005
 dear+0.004
 Please+0.004
 Uncle+0.004
 hello+0.004
Neg logit contributions
 ease-0.006
ws-0.006
zip-0.006
ips-0.006
nob-0.006
Token to
Token ID284
Feature activation-5.91e+00

Pos logit contributions
 oh+0.019
 dear+0.018
 Please+0.017
Run+0.017
 Uncle+0.016
Neg logit contributions
 ease-0.019
ws-0.018
zip-0.018
ips-0.017
 surfaces-0.017
Token play
Token ID711
Feature activation-2.57e+00

Pos logit contributions
ips+0.120
 ease+0.113
ws+0.111
 stairs+0.110
packs+0.109
Neg logit contributions
 dear-0.149
 oh-0.146
 Please-0.138
 please-0.138
 sorry-0.136
Token impressed
Token ID12617
Feature activation+2.41e+00

Pos logit contributions
 oh+0.027
 dear+0.026
 Please+0.023
Run+0.023
 Uncle+0.022
Neg logit contributions
 ease-0.046
ws-0.045
zip-0.043
ips-0.043
 surfaces-0.043
Token with
Token ID351
Feature activation-3.75e-01

Pos logit contributions
 oh+0.068
 gust+0.066
 dear+0.065
Run+0.065
 gru+0.062
Neg logit contributions
 ease-0.038
ips-0.037
 neatly-0.036
zip-0.036
pack-0.035
Token how
Token ID703
Feature activation+1.60e+00

Pos logit contributions
 ease+0.008
ws+0.008
zip+0.008
ips+0.008
nob+0.008
Neg logit contributions
 oh-0.009
 dear-0.008
 Uncle-0.008
 Please-0.008
Run-0.008
Token he
Token ID339
Feature activation-1.49e+00

Pos logit contributions
 oh+0.035
 dear+0.032
Run+0.031
 Please+0.030
 Uncle+0.029
Neg logit contributions
 ease-0.040
ws-0.038
zip-0.037
ips-0.036
 surfaces-0.036
Token managed
Token ID5257
Feature activation-2.69e+00

Pos logit contributions
 ease+0.037
ws+0.036
ips+0.035
 surfaces+0.034
 driveway+0.034
Neg logit contributions
 oh-0.032
 dear-0.030
 Please-0.029
Run-0.028
 Can-0.026
Token to
Token ID284
Feature activation-6.73e+00

Pos logit contributions
zip+0.078
ips+0.077
 ramp+0.075
pack+0.075
ws+0.075
Neg logit contributions
 oh-0.041
 dear-0.039
 Uncle-0.038
 hello-0.038
 Harry-0.036
Token try
Token ID1949
Feature activation-2.82e+00

Pos logit contributions
ips+0.150
ws+0.140
eds+0.136
ics+0.134
 channels+0.134
Neg logit contributions
 dear-0.159
 oh-0.152
 please-0.144
 Please-0.139
 insect-0.134
Token and
Token ID290
Feature activation-2.27e+00

Pos logit contributions
zip+0.087
ws+0.086
 driveway+0.085
pack+0.084
|+0.084
Neg logit contributions
 oh-0.037
 Uncle-0.036
 hello-0.035
 dear-0.035
 Harry-0.033
Token make
Token ID787
Feature activation-2.77e+00

Pos logit contributions
ips+0.046
ws+0.044
eds+0.042
sey+0.040
zip+0.040
Neg logit contributions
 oh-0.061
 dear-0.059
 Please-0.057
 sorry-0.055
 hello-0.053
Token something
Token ID1223
Feature activation+4.48e-01

Pos logit contributions
ws+0.071
ips+0.071
zip+0.070
nob+0.070
 driveway+0.070
Neg logit contributions
 oh-0.056
 dear-0.053
 Uncle-0.049
 sorry-0.046
 hello-0.045
Token so
Token ID523
Feature activation+1.17e+00

Pos logit contributions
 oh+0.010
 dear+0.010
 Please+0.009
Run+0.009
 Uncle+0.009
Neg logit contributions
 ease-0.011
ws-0.011
zip-0.010
ips-0.010
nob-0.010
Token the
Token ID262
Feature activation-2.09e+00

Pos logit contributions
 ease+0.090
ws+0.085
 driveway+0.085
|+0.084
zip+0.083
Neg logit contributions
 oh-0.068
 dear-0.059
 Uncle-0.055
 Please-0.051
 hello-0.048
Token black
Token ID2042
Feature activation-8.43e-01

Pos logit contributions
 ease+0.057
ws+0.055
zip+0.054
ips+0.053
ics+0.052
Neg logit contributions
 oh-0.041
 dear-0.039
 Please-0.036
 Uncle-0.035
Run-0.034
Tokenboard
Token ID3526
Feature activation-2.54e+00

Pos logit contributions
 ease+0.016
ws+0.015
zip+0.015
ips+0.014
nob+0.014
Neg logit contributions
 oh-0.023
 dear-0.022
Run-0.021
 Please-0.021
 Uncle-0.020
Token so
Token ID523
Feature activation+1.67e+00

Pos logit contributions
 ease+0.071
ws+0.067
eds+0.064
zip+0.064
 surfaces+0.063
Neg logit contributions
 oh-0.050
 dear-0.045
 Please-0.045
 sorry-0.041
 Can-0.040
Token they
Token ID484
Feature activation-2.75e+00

Pos logit contributions
 oh+0.029
Run+0.027
 dear+0.027
 Please+0.026
 goat+0.024
Neg logit contributions
 ease-0.047
zip-0.044
 stairs-0.044
ws-0.044
 neatly-0.043
Token could
Token ID714
Feature activation-6.69e+00

Pos logit contributions
ips+0.077
eds+0.074
 ease+0.073
ws+0.072
 surfaces+0.072
Neg logit contributions
 oh-0.049
 Please-0.048
 dear-0.045
Run-0.041
 please-0.041
Token buy
Token ID2822
Feature activation-2.64e+00

Pos logit contributions
eds+0.168
ips+0.162
 surfaces+0.155
 stations+0.154
ge+0.153
Neg logit contributions
 please-0.146
 dear-0.134
 Please-0.132
 oh-0.124
 Uncle-0.124
Token some
Token ID617
Feature activation-2.54e+00

Pos logit contributions
 ease+0.068
ws+0.067
zip+0.066
ge+0.065
ips+0.064
Neg logit contributions
 oh-0.055
 dear-0.051
 Uncle-0.047
 Please-0.046
Run-0.045
Token chalk
Token ID30860
Feature activation-2.71e+00

Pos logit contributions
 ease+0.056
ws+0.054
zip+0.052
ips+0.051
nob+0.051
Neg logit contributions
 oh-0.062
 dear-0.059
 Please-0.056
Run-0.055
 Uncle-0.054
Token.
Token ID13
Feature activation-1.28e+00

Pos logit contributions
 ease+0.070
ws+0.064
 surfaces+0.062
zip+0.061
ge+0.060
Neg logit contributions
 oh-0.059
 dear-0.055
 Please-0.055
 Uncle-0.050
 Can-0.049
Token Lily
Token ID20037
Feature activation-1.43e+00

Pos logit contributions
 ease+0.036
ws+0.035
zip+0.033
ips+0.033
ge+0.033
Neg logit contributions
 oh-0.024
 dear-0.022
 Please-0.021
Run-0.020
 Uncle-0.020
Token and
Token ID290
Feature activation-1.67e+00

Pos logit contributions
ws+0.061
 ease+0.060
 driveway+0.060
eds+0.058
zip+0.057
Neg logit contributions
 oh-0.041
 dear-0.033
 sorry-0.033
 Uncle-0.032
 Please-0.032
Token worked
Token ID3111
Feature activation-2.25e+00

Pos logit contributions
eds+0.036
ws+0.036
ips+0.035
 ease+0.035
 channels+0.034
Neg logit contributions
 oh-0.041
Run-0.033
 gust-0.033
 dear-0.033
 Harry-0.033
Token until
Token ID1566
Feature activation-1.89e-01

Pos logit contributions
ws+0.064
 driveway+0.062
eds+0.061
 ease+0.061
zip+0.059
Neg logit contributions
 oh-0.040
 dear-0.034
 sorry-0.032
 gust-0.031
 Uncle-0.031
Token she
Token ID673
Feature activation-2.30e+00

Pos logit contributions
ws+0.006
zip+0.006
 ease+0.006
eds+0.005
nob+0.005
Neg logit contributions
 oh-0.003
 dear-0.003
 gust-0.003
 Uncle-0.003
 Please-0.003
Token managed
Token ID5257
Feature activation-2.82e+00

Pos logit contributions
 channels+0.063
ws+0.060
 surfaces+0.060
eds+0.060
 ease+0.059
Neg logit contributions
 oh-0.043
 dear-0.039
 Please-0.034
 named-0.034
 gust-0.032
Token to
Token ID284
Feature activation-6.68e+00

Pos logit contributions
zip+0.090
ws+0.088
 ramp+0.087
 driveway+0.087
|+0.087
Neg logit contributions
 oh-0.035
 dear-0.034
 Sally-0.034
 Uncle-0.033
 hello-0.030
Token solve
Token ID8494
Feature activation-3.34e+00

Pos logit contributions
ips+0.168
eds+0.157
ests+0.147
orns+0.147
 channels+0.144
Neg logit contributions
 oh-0.144
 dear-0.138
 please-0.138
 Please-0.129
 Harry-0.129
Token the
Token ID262
Feature activation-2.91e+00

Pos logit contributions
 driveway+0.102
ws+0.102
zip+0.100
pack+0.098
seek+0.097
Neg logit contributions
 oh-0.050
 Uncle-0.043
 dear-0.042
 gust-0.036
 Sally-0.035
Token puzzle
Token ID15027
Feature activation-3.97e+00

Pos logit contributions
 ease+0.051
ws+0.049
zip+0.048
ips+0.046
ge+0.043
Neg logit contributions
 oh-0.084
 dear-0.082
 Please-0.080
 gust-0.077
Run-0.076
Token.
Token ID13
Feature activation-1.91e+00

Pos logit contributions
ws+0.130
zip+0.127
 ease+0.124
eds+0.117
|+0.115
Neg logit contributions
 oh-0.052
 dear-0.048
 gust-0.044
 Uncle-0.043
 Please-0.043
Token She
Token ID1375
Feature activation-1.58e+00

Pos logit contributions
 ease+0.051
ws+0.051
ge+0.048
zip+0.048
ips+0.048
Neg logit contributions
 oh-0.036
 dear-0.035
 Please-0.033
 Uncle-0.032
Run-0.031
Token and
Token ID290
Feature activation-6.21e-01

Pos logit contributions
ws+0.098
ips+0.095
 ease+0.095
irs+0.093
zip+0.093
Neg logit contributions
 oh-0.046
 dear-0.044
 gust-0.037
 sorry-0.036
 Uncle-0.036
Token soon
Token ID2582
Feature activation+7.20e-01

Pos logit contributions
 ease+0.014
ws+0.014
ips+0.013
zip+0.013
 surfaces+0.013
Neg logit contributions
 oh-0.015
 dear-0.014
 Please-0.013
 Uncle-0.013
Run-0.013
Token it
Token ID340
Feature activation-2.57e+00

Pos logit contributions
 oh+0.016
 dear+0.015
 bye+0.014
 Uncle+0.014
 Please+0.014
Neg logit contributions
 ease-0.017
ws-0.017
zip-0.017
pack-0.016
ips-0.016
Token was
Token ID373
Feature activation-1.06e-01

Pos logit contributions
ws+0.068
 ease+0.068
ips+0.067
 surfaces+0.066
 channels+0.066
Neg logit contributions
 oh-0.048
 dear-0.045
Run-0.044
 Please-0.042
 gust-0.042
Token able
Token ID1498
Feature activation-3.58e-02

Pos logit contributions
ws+0.002
ips+0.002
 surfaces+0.002
zip+0.002
pack+0.002
Neg logit contributions
 oh-0.003
 dear-0.002
 sorry-0.002
 gust-0.002
 Uncle-0.002
Token to
Token ID284
Feature activation-6.67e+00

Pos logit contributions
ws+0.001
 ease+0.001
zip+0.001
ips+0.001
pack+0.001
Neg logit contributions
 oh-0.001
 dear-0.001
 Please-0.001
 Uncle-0.001
 gust-0.001
Token fly
Token ID6129
Feature activation-2.86e+00

Pos logit contributions
ips+0.171
 channels+0.165
 positions+0.155
oons+0.154
 stairs+0.151
Neg logit contributions
 dear-0.144
 sorry-0.139
 oh-0.137
 please-0.133
 goat-0.128
Token again
Token ID757
Feature activation-7.76e-01

Pos logit contributions
ws+0.054
ips+0.054
pack+0.053
 driveway+0.053
 ease+0.052
Neg logit contributions
 oh-0.075
 Please-0.069
 dear-0.069
 hello-0.066
 bye-0.066
Token.
Token ID13
Feature activation-1.43e+00

Pos logit contributions
ws+0.023
 ease+0.023
pack+0.022
zip+0.022
ips+0.022
Neg logit contributions
 oh-0.013
 Please-0.012
 dear-0.011
 Uncle-0.011
 gust-0.010
Token Ros
Token ID10018
Feature activation-1.42e+00

Pos logit contributions
 ease+0.037
ws+0.037
ips+0.034
zip+0.034
ge+0.034
Neg logit contributions
 oh-0.029
 dear-0.028
 Please-0.026
Run-0.025
 Uncle-0.025
Tokenie
Token ID494
Feature activation-3.45e-01

Pos logit contributions
 ease+0.046
 stairs+0.045
 ramp+0.045
 driveway+0.045
 bank+0.044
Neg logit contributions
heart-0.019
 dear-0.016
 oh-0.016
 goat-0.013
 insect-0.013
Token,
Token ID11
Feature activation+2.08e+00

Pos logit contributions
 ease+0.015
ws+0.014
ips+0.014
zip+0.014
 bank+0.014
Neg logit contributions
 oh-0.009
Run-0.008
 dear-0.008
 Please-0.008
 gust-0.007
Token Bob
Token ID5811
Feature activation+3.70e-01

Pos logit contributions
 oh+0.038
 dear+0.037
 Please+0.035
 gust+0.033
Run+0.032
Neg logit contributions
 ease-0.057
 stairs-0.052
ics-0.052
zip-0.052
 tops-0.051
Token and
Token ID290
Feature activation+1.49e+00

Pos logit contributions
 oh+0.006
 dear+0.006
 Please+0.006
 Uncle+0.006
 goat+0.006
Neg logit contributions
 ease-0.010
ws-0.010
 surfaces-0.010
ips-0.010
zip-0.010
Token Jay
Token ID9180
Feature activation-5.31e-01

Pos logit contributions
 oh+0.028
 dear+0.025
Run+0.024
 bye+0.024
 Uncle+0.024
Neg logit contributions
 ease-0.041
ws-0.041
ips-0.039
zip-0.039
nob-0.038
Token managed
Token ID5257
Feature activation-3.15e+00

Pos logit contributions
 ease+0.013
ws+0.012
ips+0.012
zip+0.012
 surfaces+0.012
Neg logit contributions
 oh-0.012
 dear-0.011
 Please-0.010
Run-0.010
 bye-0.010
Token to
Token ID284
Feature activation-6.66e+00

Pos logit contributions
zip+0.092
ips+0.091
ws+0.090
 surfaces+0.089
 channels+0.089
Neg logit contributions
 oh-0.047
 Uncle-0.047
 hello-0.044
 Harry-0.043
 dear-0.043
Token make
Token ID787
Feature activation-2.89e+00

Pos logit contributions
ips+0.163
 channels+0.144
oons+0.138
eds+0.138
ws+0.137
Neg logit contributions
 oh-0.143
 dear-0.143
 named-0.132
 please-0.131
 Harry-0.129
Token the
Token ID262
Feature activation-2.46e+00

Pos logit contributions
ws+0.080
zip+0.079
ips+0.077
nob+0.077
zag+0.076
Neg logit contributions
 oh-0.055
 dear-0.051
 Uncle-0.048
 sorry-0.045
 gust-0.042
Token most
Token ID749
Feature activation+6.53e-01

Pos logit contributions
ws+0.046
eds+0.045
zip+0.045
ips+0.045
ge+0.044
Neg logit contributions
 dear-0.069
 oh-0.066
 sorry-0.065
 gust-0.061
 Please-0.061
Token beautiful
Token ID4950
Feature activation-1.28e+00

Pos logit contributions
 oh+0.014
 dear+0.013
 Please+0.012
Run+0.012
 Uncle+0.012
Neg logit contributions
 ease-0.016
ws-0.016
zip-0.015
ips-0.015
 surfaces-0.015
Token castle
Token ID16669
Feature activation-2.03e+00

Pos logit contributions
ws+0.022
 ease+0.022
zip+0.021
ips+0.021
pack+0.020
Neg logit contributions
 oh-0.036
 dear-0.036
 gust-0.033
 Please-0.033
 Uncle-0.032
Token
Token ID241
Feature activation+2.55e+00

Pos logit contributions
 oh+0.050
 dear+0.048
 Please+0.045
Run+0.045
 Uncle+0.044
Neg logit contributions
 ease-0.049
ws-0.047
zip-0.045
ips-0.045
 surfaces-0.044
TokenMom
Token ID29252
Feature activation+1.68e+00

Pos logit contributions
 oh+0.058
 dear+0.056
 gust+0.052
 sorry+0.052
 bye+0.051
Neg logit contributions
 ease-0.058
zip-0.055
ws-0.054
ge-0.053
ips-0.053
Tokenmy
Token ID1820
Feature activation+2.75e+00

Pos logit contributions
 oh+0.036
 dear+0.033
Run+0.033
 Please+0.032
 gust+0.031
Neg logit contributions
 ease-0.041
ws-0.040
zip-0.039
pack-0.038
nob-0.038
Token,
Token ID11
Feature activation+3.93e+00

Pos logit contributions
 oh+0.059
Run+0.054
 gust+0.052
 dear+0.051
 Please+0.049
Neg logit contributions
 ease-0.065
zip-0.063
ws-0.062
nob-0.062
pack-0.062
Token what
Token ID644
Feature activation+8.08e-02

Pos logit contributions
 oh+0.073
Run+0.069
 dear+0.067
 gust+0.067
 Please+0.064
Neg logit contributions
 ease-0.100
ws-0.095
zip-0.094
pack-0.093
 stairs-0.092
Tokenâ
Token ID22940
Feature activation+4.19e+00

Pos logit contributions
 oh+0.002
 dear+0.002
Run+0.002
 Please+0.002
 gust+0.002
Neg logit contributions
 ease-0.001
ws-0.001
zip-0.001
ips-0.001
pack-0.001
Token
Token ID26391
Feature activation+4.01e+00

Pos logit contributions
 oh+0.114
 dear+0.111
 gust+0.102
 Please+0.101
 Uncle+0.101
Neg logit contributions
nob-0.076
ws-0.075
zip-0.075
pack-0.074
 ease-0.072
Token
Token ID8151
Feature activation-2.52e-01

Pos logit contributions
 oh+0.101
 dear+0.095
Run+0.091
 gust+0.088
 bye+0.087
Neg logit contributions
 ease-0.079
zip-0.079
ips-0.077
pack-0.076
nob-0.075
Tokens
Token ID82
Feature activation+2.64e+00

Pos logit contributions
 ease+0.005
ws+0.005
 surfaces+0.005
zip+0.005
nob+0.005
Neg logit contributions
 oh-0.006
 dear-0.006
 Please-0.006
Run-0.006
 Uncle-0.006
Token this
Token ID428
Feature activation-1.30e+00

Pos logit contributions
 oh+0.060
Run+0.058
 Please+0.056
 gust+0.055
 dear+0.055
Neg logit contributions
ics-0.057
 ease-0.057
pack-0.055
ws-0.055
nob-0.054
Token?
Token ID30
Feature activation+9.97e-01

Pos logit contributions
 ease+0.026
ws+0.025
zip+0.024
ips+0.024
 surfaces+0.023
Neg logit contributions
 oh-0.034
 dear-0.033
 Please-0.031
Run-0.031
 Uncle-0.030
Token a
Token ID257
Feature activation+3.20e+00

Pos logit contributions
 oh+0.023
 dear+0.022
 Please+0.020
Run+0.020
 Uncle+0.020
Neg logit contributions
 ease-0.029
ws-0.028
zip-0.027
ips-0.027
nob-0.027
Token time
Token ID640
Feature activation+2.17e+00

Pos logit contributions
 oh+0.098
 dear+0.090
 Please+0.088
Run+0.088
 Can+0.086
Neg logit contributions
 ease-0.051
ws-0.048
pack-0.044
nob-0.044
zip-0.043
Token,
Token ID11
Feature activation+1.90e+00

Pos logit contributions
 oh+0.043
 dear+0.041
Run+0.040
 goat+0.039
 Please+0.039
Neg logit contributions
 ease-0.055
ics-0.052
zag-0.052
ips-0.051
 stairs-0.051
Token there
Token ID612
Feature activation-2.11e+00

Pos logit contributions
 oh+0.041
 dear+0.040
 Please+0.037
Run+0.036
 Uncle+0.036
Neg logit contributions
 ease-0.046
ws-0.044
zip-0.043
ips-0.042
pack-0.042
Token was
Token ID373
Feature activation+3.25e+00

Pos logit contributions
ws+0.050
 ease+0.049
zip+0.046
ips+0.046
 surfaces+0.046
Neg logit contributions
 oh-0.048
 dear-0.044
Run-0.042
 Uncle-0.041
 Please-0.041
Token a
Token ID257
Feature activation+4.19e+00

Pos logit contributions
 oh+0.082
 dear+0.081
 Please+0.080
 gru+0.077
 bye+0.077
Neg logit contributions
 ease-0.068
ws-0.058
zip-0.057
ics-0.057
nob-0.056
Token big
Token ID1263
Feature activation+2.51e+00

Pos logit contributions
 oh+0.099
heart+0.097
Run+0.096
 Please+0.091
 gru+0.090
Neg logit contributions
 ramp-0.081
 stairs-0.075
 block-0.072
 blocks-0.072
 build-0.071
Token bird
Token ID6512
Feature activation+2.29e+00

Pos logit contributions
 oh+0.068
Run+0.063
 Please+0.061
 gru+0.060
 importantly+0.058
Neg logit contributions
 blocks-0.047
 ramp-0.047
 driveway-0.045
 stairs-0.045
 build-0.044
Token called
Token ID1444
Feature activation+3.16e+00

Pos logit contributions
 oh+0.035
 dear+0.033
Run+0.031
 Please+0.030
 Uncle+0.029
Neg logit contributions
 ease-0.068
zip-0.066
ws-0.066
nob-0.066
eds-0.066
Token an
Token ID281
Feature activation+1.65e+00

Pos logit contributions
 oh+0.069
Run+0.064
heart+0.062
 dear+0.060
 Please+0.060
Neg logit contributions
 ease-0.072
nob-0.066
 stairs-0.065
zip-0.064
ics-0.063
Token ostr
Token ID42387
Feature activation+3.76e+00

Pos logit contributions
 oh+0.033
 Please+0.033
Run+0.032
 Can+0.029
 dear+0.028
Neg logit contributions
 ease-0.039
 bank-0.037
 tops-0.037
ws-0.036
sey-0.036
Token
Token ID198
Feature activation-2.58e+00

Pos logit contributions
 ease+0.029
ws+0.028
zip+0.026
ips+0.026
 surfaces+0.025
Neg logit contributions
 oh-0.037
 dear-0.035
 Please-0.033
Run-0.033
 Uncle-0.032
Token
Token ID198
Feature activation-6.74e-01

Pos logit contributions
 ease+0.070
ws+0.067
 surfaces+0.066
nob+0.064
 rebuild+0.063
Neg logit contributions
 oh-0.049
 dear-0.046
 Uncle-0.046
Run-0.046
 Please-0.046
Token"
Token ID1
Feature activation+4.57e-01

Pos logit contributions
 ease+0.014
ws+0.013
zip+0.013
ips+0.013
pack+0.012
Neg logit contributions
 oh-0.017
 dear-0.017
 Please-0.016
Run-0.015
 Uncle-0.015
TokenLook
Token ID8567
Feature activation+2.95e+00

Pos logit contributions
 oh+0.012
 dear+0.011
 Please+0.010
Run+0.010
 Uncle+0.010
Neg logit contributions
 ease-0.010
ws-0.009
zip-0.009
ips-0.009
nob-0.008
Token,
Token ID11
Feature activation+4.76e+00

Pos logit contributions
Run+0.058
 gru+0.056
 oh+0.054
 Please+0.054
 importantly+0.051
Neg logit contributions
 ease-0.071
ws-0.071
ips-0.067
zip-0.067
cast-0.066
Token Ben
Token ID3932
Feature activation+4.38e+00

Pos logit contributions
 gru+0.108
 oh+0.104
 gust+0.103
Run+0.102
heart+0.100
Neg logit contributions
 ease-0.093
ws-0.091
 become-0.086
 stairs-0.086
zip-0.085
Token,
Token ID11
Feature activation+4.17e+00

Pos logit contributions
 oh+0.105
Run+0.098
 gust+0.093
 Please+0.091
 dear+0.089
Neg logit contributions
ws-0.092
nob-0.086
zip-0.086
 ease-0.086
 build-0.084
Token a
Token ID257
Feature activation+3.08e+00

Pos logit contributions
 gru+0.095
Run+0.091
 oh+0.091
 gust+0.091
 dear+0.091
Neg logit contributions
 ease-0.080
 build-0.080
ws-0.077
zip-0.077
 stairs-0.077
Token big
Token ID1263
Feature activation+1.91e+00

Pos logit contributions
 oh+0.078
 dear+0.070
Run+0.070
 Please+0.069
 Uncle+0.065
Neg logit contributions
 ease-0.062
 stairs-0.059
 ramp-0.058
 driveway-0.057
zip-0.055
Token red
Token ID2266
Feature activation+3.49e-01

Pos logit contributions
 oh+0.050
 Please+0.046
Run+0.045
 importantly+0.045
 dear+0.045
Neg logit contributions
 ease-0.035
 ramp-0.034
 stairs-0.034
 driveway-0.033
 bank-0.033
Token ball
Token ID2613
Feature activation+4.34e-01

Pos logit contributions
 oh+0.009
 dear+0.009
 Please+0.008
Run+0.008
 importantly+0.008
Neg logit contributions
 ease-0.006
 bank-0.006
ws-0.006
 stairs-0.006
 driveway-0.006
Token a
Token ID257
Feature activation+3.36e+00

Pos logit contributions
 oh+0.030
 dear+0.028
 Please+0.026
Run+0.026
 Uncle+0.025
Neg logit contributions
 ease-0.038
ws-0.037
zip-0.035
ips-0.035
 surfaces-0.035
Token time
Token ID640
Feature activation+2.56e+00

Pos logit contributions
 oh+0.101
 dear+0.093
 Please+0.091
Run+0.090
 Can+0.088
Neg logit contributions
 ease-0.056
ws-0.053
nob-0.049
pack-0.048
zip-0.047
Token there
Token ID612
Feature activation-1.25e+00

Pos logit contributions
 oh+0.054
 dear+0.052
 goat+0.052
Run+0.051
 Please+0.051
Neg logit contributions
 ease-0.060
zag-0.056
ics-0.056
 stairs-0.056
ips-0.054
Token was
Token ID373
Feature activation+3.68e+00

Pos logit contributions
 ease+0.026
ws+0.026
zip+0.024
ips+0.024
 surfaces+0.024
Neg logit contributions
 oh-0.032
 dear-0.030
Run-0.028
 Please-0.028
 Uncle-0.028
Token a
Token ID257
Feature activation+4.36e+00

Pos logit contributions
 oh+0.096
 dear+0.095
 Please+0.094
 gru+0.090
 bye+0.090
Neg logit contributions
 ease-0.076
ws-0.061
ics-0.061
nob-0.060
 tops-0.060
Token little
Token ID1310
Feature activation+5.06e+00

Pos logit contributions
 oh+0.105
heart+0.096
Run+0.095
 Please+0.093
 dear+0.090
Neg logit contributions
 ramp-0.090
 stairs-0.084
 driveway-0.079
 build-0.078
 bank-0.077
Token pig
Token ID12967
Feature activation+2.64e+00

Pos logit contributions
 oh+0.125
 gru+0.109
 Please+0.106
 dear+0.105
Run+0.105
Neg logit contributions
 ramp-0.101
 stairs-0.100
 blocks-0.099
 block-0.095
 cush-0.092
Token named
Token ID3706
Feature activation+1.39e+00

Pos logit contributions
 oh+0.042
 dear+0.038
Run+0.037
 Please+0.036
 gust+0.035
Neg logit contributions
 ease-0.079
ws-0.077
zip-0.076
pack-0.076
nob-0.076
Token Pe
Token ID2631
Feature activation+2.64e+00

Pos logit contributions
 oh+0.024
 dear+0.023
Run+0.022
 Please+0.021
 gust+0.020
Neg logit contributions
 ease-0.040
ws-0.038
zip-0.037
nob-0.036
pack-0.036
Tokenppa
Token ID44989
Feature activation+2.40e+00

Pos logit contributions
 oh+0.064
 dear+0.058
 goat+0.055
 Uncle+0.055
 Please+0.053
Neg logit contributions
 ease-0.058
 surfaces-0.056
 stairs-0.055
 tops-0.054
 channels-0.054
Token.
Token ID13
Feature activation-1.61e+00

Pos logit contributions
 oh+0.052
 dear+0.049
 Please+0.046
 Uncle+0.045
 bye+0.044
Neg logit contributions
 ease-0.059
ws-0.056
zip-0.055
ips-0.055
 surfaces-0.054
Token to
Token ID284
Feature activation-2.62e-01

Pos logit contributions
Run+0.043
 oh+0.041
 dear+0.040
 goat+0.038
 Uncle+0.038
Neg logit contributions
 ease-0.049
 neatly-0.046
 become-0.045
ws-0.044
 aside-0.044
Token Jill
Token ID22772
Feature activation+1.34e+00

Pos logit contributions
 ease+0.008
ws+0.008
zip+0.007
ips+0.007
nob+0.007
Neg logit contributions
 oh-0.004
 dear-0.004
 Please-0.004
Run-0.004
 Uncle-0.004
Token,
Token ID11
Feature activation+3.13e+00

Pos logit contributions
 oh+0.018
 dear+0.017
 Please+0.015
Run+0.014
 Uncle+0.014
Neg logit contributions
 ease-0.044
ws-0.043
zip-0.042
ips-0.041
 surfaces-0.041
Token "
Token ID366
Feature activation+6.83e-01

Pos logit contributions
 dear+0.058
 oh+0.057
Run+0.055
 bye+0.051
 gust+0.050
Neg logit contributions
 ease-0.078
 neatly-0.071
 stairs-0.070
ge-0.069
zip-0.069
TokenLook
Token ID8567
Feature activation+3.10e+00

Pos logit contributions
 oh+0.016
 dear+0.015
 Please+0.014
Run+0.014
 Uncle+0.014
Neg logit contributions
 ease-0.016
ws-0.015
zip-0.015
ips-0.015
nob-0.014
Token,
Token ID11
Feature activation+5.00e+00

Pos logit contributions
Run+0.064
 oh+0.057
 gru+0.056
 Please+0.056
 dear+0.054
Neg logit contributions
 ease-0.076
ws-0.071
cast-0.070
zip-0.070
ips-0.069
Token our
Token ID674
Feature activation-2.69e-02

Pos logit contributions
Run+0.088
 oh+0.087
 gru+0.081
 gust+0.080
 dear+0.079
Neg logit contributions
 ease-0.129
 build-0.124
 blocks-0.119
 stairs-0.118
ws-0.117
Token bathing
Token ID39153
Feature activation-1.10e+00

Pos logit contributions
 ease+0.001
ws+0.001
zip+0.000
ips+0.000
nob+0.000
Neg logit contributions
 oh-0.001
 dear-0.001
 Please-0.001
Run-0.001
 bye-0.001
Token suits
Token ID14803
Feature activation-1.13e+00

Pos logit contributions
 ease+0.022
ws+0.021
zip+0.019
ips+0.019
nob+0.019
Neg logit contributions
 oh-0.029
 dear-0.028
 Please-0.027
Run-0.026
 Uncle-0.026
Token match
Token ID2872
Feature activation-2.04e+00

Pos logit contributions
 ease+0.027
ws+0.027
zip+0.025
nob+0.025
 surfaces+0.025
Neg logit contributions
 oh-0.024
 dear-0.024
 Please-0.022
 bye-0.021
Run-0.021
Token!"
Token ID2474
Feature activation+2.40e-01

Pos logit contributions
 ease+0.053
ws+0.050
zip+0.049
ips+0.048
nob+0.048
Neg logit contributions
 oh-0.042
 dear-0.041
 Please-0.037
 bye-0.035
 Uncle-0.035
Token a
Token ID257
Feature activation+6.11e-01

Pos logit contributions
 oh+0.037
 dear+0.036
 Please+0.035
Run+0.034
 Can+0.032
Neg logit contributions
 ease-0.037
nob-0.034
ws-0.034
zip-0.034
 stairs-0.033
Token noise
Token ID7838
Feature activation+2.22e+00

Pos logit contributions
 oh+0.012
 dear+0.012
 Please+0.011
 Uncle+0.011
Run+0.011
Neg logit contributions
 stairs-0.014
 ease-0.014
 ramp-0.014
 bank-0.014
 surfaces-0.014
Token.
Token ID13
Feature activation+1.18e+00

Pos logit contributions
 dear+0.065
 oh+0.063
 Please+0.059
 bye+0.057
 Uncle+0.056
Neg logit contributions
nob-0.037
 ease-0.036
pack-0.035
ws-0.035
zip-0.035
Token It
Token ID632
Feature activation-6.50e-01

Pos logit contributions
 oh+0.029
 dear+0.027
Run+0.027
heart+0.025
 goat+0.024
Neg logit contributions
nob-0.024
 stairs-0.023
 surfaces-0.022
zip-0.022
 build-0.022
Token was
Token ID373
Feature activation+2.26e+00

Pos logit contributions
 ease+0.012
ws+0.011
zip+0.011
nob+0.011
 surfaces+0.010
Neg logit contributions
 oh-0.018
 dear-0.018
 Please-0.017
Run-0.016
 Uncle-0.016
Token a
Token ID257
Feature activation+1.80e+00

Pos logit contributions
 dear+0.059
 oh+0.058
Run+0.056
 Please+0.056
 bye+0.053
Neg logit contributions
 ease-0.045
nob-0.040
 stairs-0.040
 become-0.039
ws-0.039
Token monster
Token ID9234
Feature activation+3.07e+00

Pos logit contributions
 oh+0.044
 dear+0.041
Run+0.040
 Please+0.040
heart+0.040
Neg logit contributions
 stairs-0.035
 ramp-0.035
 ease-0.033
 driveway-0.033
 bank-0.032
Token!
Token ID0
Feature activation+5.65e-01

Pos logit contributions
 dear+0.073
 oh+0.073
 Uncle+0.066
 Please+0.066
 bye+0.065
Neg logit contributions
nob-0.062
zag-0.061
 ramp-0.060
ge-0.060
 ease-0.059
Token The
Token ID383
Feature activation-1.05e-02

Pos logit contributions
 oh+0.013
 dear+0.012
heart+0.012
Run+0.011
 importantly+0.011
Neg logit contributions
 stairs-0.012
zip-0.011
 build-0.011
nob-0.011
 ease-0.011
Token monster
Token ID9234
Feature activation+1.78e+00

Pos logit contributions
 ease+0.000
ws+0.000
zip+0.000
ips+0.000
nob+0.000
Neg logit contributions
 oh-0.000
 dear-0.000
 Please-0.000
Run-0.000
 Uncle-0.000
Token was
Token ID373
Feature activation+1.08e+00

Pos logit contributions
 oh+0.049
 dear+0.048
 Please+0.045
 bye+0.044
 Uncle+0.044
Neg logit contributions
nob-0.029
 ease-0.029
zag-0.029
zip-0.028
 bank-0.028
Token grace
Token ID11542
Feature activation+8.34e-01

Pos logit contributions
 ease+0.013
ws+0.013
zip+0.012
ips+0.012
 surfaces+0.012
Neg logit contributions
 oh-0.023
 dear-0.022
 Please-0.021
Run-0.021
 Uncle-0.021
Tokenfully
Token ID2759
Feature activation-3.90e-01

Pos logit contributions
 oh+0.016
 dear+0.015
 Please+0.014
Run+0.014
 bye+0.014
Neg logit contributions
 ease-0.022
ws-0.022
zip-0.021
ips-0.021
 surfaces-0.020
Token in
Token ID287
Feature activation-8.88e-01

Pos logit contributions
ws+0.010
 ease+0.010
zip+0.010
ips+0.010
pack+0.010
Neg logit contributions
 oh-0.007
 dear-0.007
 Please-0.007
 bye-0.007
 hello-0.006
Token the
Token ID262
Feature activation-5.46e-01

Pos logit contributions
ws+0.022
zip+0.021
 ease+0.021
pack+0.021
ics+0.021
Neg logit contributions
 oh-0.019
 dear-0.018
 Uncle-0.017
 Please-0.017
Run-0.017
Token sky
Token ID6766
Feature activation+4.92e-01

Pos logit contributions
 ease+0.011
ws+0.010
zip+0.010
ips+0.010
nob+0.010
Neg logit contributions
 oh-0.015
 dear-0.014
 Please-0.013
Run-0.013
 Uncle-0.013
Token and
Token ID290
Feature activation+9.66e-01

Pos logit contributions
 oh+0.009
 dear+0.008
 Please+0.008
 Uncle+0.007
 bye+0.007
Neg logit contributions
ws-0.014
 ease-0.014
zip-0.013
ips-0.013
pack-0.013
Token she
Token ID673
Feature activation-2.78e-01

Pos logit contributions
 oh+0.021
 dear+0.019
 Please+0.018
 bye+0.018
Run+0.018
Neg logit contributions
 ease-0.024
ws-0.024
ips-0.023
zip-0.023
pack-0.022
Token knew
Token ID2993
Feature activation+1.03e+00

Pos logit contributions
ws+0.007
 ease+0.007
 surfaces+0.007
ips+0.007
pack+0.007
Neg logit contributions
 oh-0.005
 dear-0.005
 bye-0.005
 Please-0.005
Run-0.005
Token that
Token ID326
Feature activation+1.92e+00

Pos logit contributions
 oh+0.019
 dear+0.017
 Please+0.016
Run+0.016
 Uncle+0.015
Neg logit contributions
 ease-0.029
ws-0.029
zip-0.028
ips-0.027
 surfaces-0.027
Token the
Token ID262
Feature activation-1.46e+00

Pos logit contributions
 oh+0.041
 dear+0.039
Run+0.037
 Please+0.037
 bye+0.035
Neg logit contributions
 ease-0.047
ws-0.045
zip-0.043
ips-0.043
 surfaces-0.042
Token crane
Token ID41175
Feature activation-1.67e+00

Pos logit contributions
 ease+0.037
ws+0.037
zip+0.036
ips+0.036
nob+0.035
Neg logit contributions
 oh-0.030
 dear-0.029
 Please-0.026
Run-0.026
 sorry-0.025
Token doctor
Token ID6253
Feature activation+5.34e-01

Pos logit contributions
 ease+0.029
ws+0.028
zip+0.027
ips+0.027
 surfaces+0.026
Neg logit contributions
 oh-0.028
 dear-0.027
 Uncle-0.025
 Please-0.025
Run-0.024
Token was
Token ID373
Feature activation+1.50e+00

Pos logit contributions
 oh+0.012
 dear+0.011
Run+0.011
 Please+0.010
 gust+0.010
Neg logit contributions
 ease-0.013
ws-0.012
zip-0.012
nob-0.012
pack-0.012
Token very
Token ID845
Feature activation+2.13e+00

Pos logit contributions
 oh+0.031
 dear+0.029
Run+0.028
 Please+0.028
 bye+0.027
Neg logit contributions
 ease-0.038
ws-0.037
zip-0.035
ips-0.035
nob-0.035
Token nice
Token ID3621
Feature activation+1.85e+00

Pos logit contributions
 oh+0.036
Run+0.033
 dear+0.032
 Please+0.030
 gust+0.030
Neg logit contributions
 ease-0.059
ws-0.058
zip-0.057
pack-0.057
 neatly-0.056
Token and
Token ID290
Feature activation+1.26e+00

Pos logit contributions
 oh+0.041
 dear+0.038
Run+0.037
 Please+0.036
 gust+0.035
Neg logit contributions
 ease-0.044
ws-0.043
ips-0.041
zip-0.041
 surfaces-0.041
Token talked
Token ID6619
Feature activation+6.98e-01

Pos logit contributions
 oh+0.025
Run+0.024
 dear+0.024
 Please+0.023
 gust+0.022
Neg logit contributions
 ease-0.031
ws-0.029
zip-0.029
 neatly-0.028
 rebuild-0.028
Token a
Token ID257
Feature activation+1.71e+00

Pos logit contributions
 oh+0.015
 dear+0.014
 Please+0.013
 Uncle+0.013
Run+0.013
Neg logit contributions
 ease-0.018
ws-0.017
zip-0.017
ips-0.016
 surfaces-0.016
Token lot
Token ID1256
Feature activation+5.86e-01

Pos logit contributions
 oh+0.044
 dear+0.042
 Please+0.040
Run+0.040
 Uncle+0.039
Neg logit contributions
 ease-0.035
ws-0.033
zip-0.032
ips-0.031
 surfaces-0.031
Token with
Token ID351
Feature activation+6.79e-01

Pos logit contributions
 oh+0.013
 dear+0.013
 Please+0.012
Run+0.012
 Uncle+0.012
Neg logit contributions
 ease-0.014
ws-0.013
zip-0.013
ips-0.013
 surfaces-0.012
Token them
Token ID606
Feature activation+3.21e-01

Pos logit contributions
 oh+0.015
 dear+0.015
 Uncle+0.014
 Please+0.014
Run+0.013
Neg logit contributions
ws-0.016
 ease-0.016
zip-0.015
ips-0.015
pack-0.015
Token.
Token ID13
Feature activation-6.57e-01

Pos logit contributions
 oh+0.007
 dear+0.007
 Please+0.006
 Uncle+0.006
 sorry+0.006
Neg logit contributions
 ease-0.008
ws-0.008
zip-0.008
ips-0.007
 surfaces-0.007
Token,
Token ID11
Feature activation+2.18e+00

Pos logit contributions
Run+0.035
 oh+0.030
 dear+0.030
 goat+0.030
 Uncle+0.029
Neg logit contributions
 ease-0.035
 aside-0.033
ws-0.033
 become-0.033
 rebuild-0.033
Token "
Token ID366
Feature activation+2.10e-01

Pos logit contributions
 dear+0.039
 oh+0.039
Run+0.038
heart+0.036
 gust+0.034
Neg logit contributions
 ease-0.054
 neatly-0.051
zip-0.051
ws-0.051
 stairs-0.050
TokenThis
Token ID1212
Feature activation-1.85e+00

Pos logit contributions
 oh+0.005
 dear+0.004
 Please+0.004
Run+0.004
 Uncle+0.004
Neg logit contributions
 ease-0.005
ws-0.005
zip-0.005
ips-0.005
 surfaces-0.005
Token candy
Token ID18550
Feature activation-1.88e+00

Pos logit contributions
 ease+0.036
ws+0.035
zip+0.033
ips+0.033
 surfaces+0.032
Neg logit contributions
 oh-0.050
 dear-0.047
 Please-0.045
Run-0.045
 Uncle-0.044
Token is
Token ID318
Feature activation+1.01e+00

Pos logit contributions
 ease+0.040
ws+0.038
zip+0.036
ge+0.036
ips+0.036
Neg logit contributions
 oh-0.048
 dear-0.047
 Please-0.044
 bye-0.042
 Uncle-0.041
Token too
Token ID1165
Feature activation+1.80e+00

Pos logit contributions
 oh+0.023
 dear+0.022
Run+0.021
 Please+0.021
 bye+0.020
Neg logit contributions
 ease-0.023
ws-0.023
zip-0.021
nob-0.021
ips-0.021
Token sour
Token ID11348
Feature activation+1.77e+00

Pos logit contributions
 oh+0.044
Run+0.039
 dear+0.038
 Please+0.037
 Uncle+0.036
Neg logit contributions
ws-0.039
 ease-0.038
zip-0.036
 neatly-0.036
ge-0.035
Token for
Token ID329
Feature activation-4.30e-01

Pos logit contributions
 oh+0.037
 dear+0.034
Run+0.034
 Please+0.032
 gru+0.032
Neg logit contributions
 ease-0.043
ws-0.043
ips-0.041
 surfaces-0.040
zip-0.040
Token me
Token ID502
Feature activation-7.24e-01

Pos logit contributions
ws+0.012
 ease+0.012
nob+0.012
zip+0.012
zag+0.012
Neg logit contributions
 oh-0.008
 dear-0.007
 Please-0.006
 Uncle-0.006
 sorry-0.006
Token."
Token ID526
Feature activation-1.63e-01

Pos logit contributions
 ease+0.018
ws+0.017
zip+0.017
 surfaces+0.016
ips+0.016
Neg logit contributions
 oh-0.016
 dear-0.015
 Please-0.014
 Uncle-0.014
 bye-0.014
Token Lily
Token ID20037
Feature activation+2.19e-01

Pos logit contributions
 ease+0.005
ws+0.005
ips+0.005
zip+0.005
ge+0.005
Neg logit contributions
 oh-0.003
 dear-0.003
 Please-0.002
 Uncle-0.002
 bye-0.002
Token with
Token ID351
Feature activation-7.18e-01

Pos logit contributions
 oh+0.014
 dear+0.014
 Please+0.013
Run+0.013
 Uncle+0.013
Neg logit contributions
 ease-0.010
ws-0.010
zip-0.009
ips-0.009
 surfaces-0.009
Token his
Token ID465
Feature activation-5.69e-01

Pos logit contributions
ws+0.017
 ease+0.016
zip+0.016
ips+0.016
pack+0.016
Neg logit contributions
 oh-0.017
 dear-0.016
 Uncle-0.015
 Please-0.014
 bye-0.014
Token pillow
Token ID28774
Feature activation-9.88e-01

Pos logit contributions
 ease+0.014
ws+0.013
zip+0.013
ips+0.013
nob+0.012
Neg logit contributions
 oh-0.013
 dear-0.012
 Please-0.011
Run-0.011
 Uncle-0.011
Token,
Token ID11
Feature activation+1.13e+00

Pos logit contributions
 ease+0.024
ws+0.023
zip+0.022
ips+0.022
 surfaces+0.022
Neg logit contributions
 oh-0.022
 dear-0.021
 Please-0.019
 Uncle-0.019
 bye-0.019
Token and
Token ID290
Feature activation+1.22e+00

Pos logit contributions
 oh+0.021
 dear+0.020
 Please+0.018
Run+0.018
 Uncle+0.017
Neg logit contributions
 ease-0.032
ws-0.030
zip-0.030
ips-0.029
 surfaces-0.029
Token was
Token ID373
Feature activation+1.77e+00

Pos logit contributions
 oh+0.026
 dear+0.025
Run+0.024
 Please+0.023
 Uncle+0.022
Neg logit contributions
 ease-0.030
ws-0.029
zip-0.028
 surfaces-0.027
nob-0.027
Token sad
Token ID6507
Feature activation+2.21e+00

Pos logit contributions
 oh+0.038
 dear+0.035
 Please+0.033
 Uncle+0.032
 sorry+0.032
Neg logit contributions
 ease-0.044
ws-0.043
zip-0.042
ips-0.041
 surfaces-0.041
Token.
Token ID13
Feature activation-1.73e+00

Pos logit contributions
 oh+0.045
 dear+0.042
 Please+0.040
Run+0.040
 Uncle+0.038
Neg logit contributions
 ease-0.057
ws-0.055
zip-0.053
ips-0.052
 surfaces-0.052
Token His
Token ID2399
Feature activation-7.48e-01

Pos logit contributions
 ease+0.048
ws+0.048
zip+0.046
pack+0.045
ics+0.045
Neg logit contributions
 oh-0.032
 dear-0.030
 Please-0.028
 Uncle-0.027
Run-0.026
Token mom
Token ID1995
Feature activation+4.66e-01

Pos logit contributions
 ease+0.018
ws+0.018
zip+0.017
ips+0.017
ge+0.017
Neg logit contributions
 oh-0.016
 dear-0.016
 Please-0.015
 Uncle-0.014
 bye-0.014
Token saw
Token ID2497
Feature activation+6.64e-01

Pos logit contributions
 oh+0.010
 dear+0.010
 Please+0.009
Run+0.009
 Uncle+0.009
Neg logit contributions
 ease-0.011
ws-0.011
zip-0.010
ips-0.010
 surfaces-0.010
Token
Token ID198
Feature activation-1.76e+00

Pos logit contributions
 ease+0.012
ws+0.012
zip+0.012
ips+0.011
 surfaces+0.011
Neg logit contributions
 oh-0.010
 dear-0.009
Run-0.008
 Please-0.008
 bye-0.008
Token
Token ID198
Feature activation-6.46e-01

Pos logit contributions
ws+0.047
 surfaces+0.046
 ease+0.046
ips+0.043
nob+0.043
Neg logit contributions
 Uncle-0.033
 oh-0.033
 dear-0.032
 Please-0.032
Run-0.030
TokenIt
Token ID1026
Feature activation-6.10e-01

Pos logit contributions
 ease+0.015
ws+0.014
 surfaces+0.013
ips+0.013
nob+0.013
Neg logit contributions
 oh-0.015
 dear-0.015
 Please-0.014
Run-0.014
 Uncle-0.014
Token was
Token ID373
Feature activation+2.24e+00

Pos logit contributions
 ease+0.015
ws+0.015
ips+0.014
pack+0.014
ge+0.014
Neg logit contributions
 oh-0.013
 dear-0.012
Run-0.012
 Please-0.012
 gust-0.011
Token time
Token ID640
Feature activation+6.08e-01

Pos logit contributions
 oh+0.055
 dear+0.052
Run+0.051
 Please+0.050
 bye+0.048
Neg logit contributions
 ease-0.051
ws-0.045
zip-0.043
 stairs-0.043
nob-0.043
Token to
Token ID284
Feature activation-2.70e+00

Pos logit contributions
 oh+0.016
 dear+0.015
Run+0.015
 Please+0.014
 gust+0.014
Neg logit contributions
 ease-0.012
ws-0.012
ips-0.011
pack-0.011
 stairs-0.011
Token go
Token ID467
Feature activation-1.49e+00

Pos logit contributions
ws+0.054
ips+0.054
 ease+0.054
zip+0.052
ge+0.052
Neg logit contributions
 oh-0.065
 dear-0.065
 bye-0.063
 Uncle-0.063
 sorry-0.061
Token inside
Token ID2641
Feature activation-3.85e-03

Pos logit contributions
ws+0.030
 ease+0.029
 surfaces+0.027
zip+0.026
nob+0.026
Neg logit contributions
 oh-0.037
 dear-0.036
 bye-0.036
 Uncle-0.036
 hello-0.035
Token.
Token ID13
Feature activation-2.17e-01

Pos logit contributions
 ease+0.000
ws+0.000
zip+0.000
ips+0.000
nob+0.000
Neg logit contributions
 oh-0.000
 dear-0.000
 Please-0.000
Run-0.000
 Uncle-0.000
Token But
Token ID887
Feature activation+1.87e+00

Pos logit contributions
 ease+0.006
zip+0.006
 driveway+0.006
 stairs+0.005
ws+0.005
Neg logit contributions
 oh-0.004
Run-0.004
 dear-0.004
 goat-0.003
 gust-0.003
Token Miss
Token ID4544
Feature activation+4.26e+00

Pos logit contributions
 oh+0.038
Run+0.038
 dear+0.037
 Please+0.035
 gru+0.035
Neg logit contributions
 ease-0.045
zip-0.041
 neatly-0.041
pack-0.040
ge-0.040
Token quickly
Token ID2952
Feature activation-1.45e-01

Pos logit contributions
 oh+0.010
 dear+0.009
Run+0.009
 Please+0.009
 Uncle+0.009
Neg logit contributions
 ease-0.011
ws-0.010
ips-0.010
zip-0.010
 surfaces-0.010
Token grabbed
Token ID13176
Feature activation-2.62e+00

Pos logit contributions
 ease+0.003
ws+0.003
ips+0.003
zip+0.003
 surfaces+0.003
Neg logit contributions
 oh-0.003
 dear-0.003
Run-0.003
 Please-0.003
 Uncle-0.003
Token the
Token ID262
Feature activation-1.76e+00

Pos logit contributions
 ease+0.059
ws+0.057
zip+0.055
ips+0.055
 surfaces+0.054
Neg logit contributions
 oh-0.062
 dear-0.059
Run-0.055
 Please-0.055
 Uncle-0.054
Token hose
Token ID31489
Feature activation-2.15e+00

Pos logit contributions
 ease+0.039
ws+0.037
zip+0.036
ips+0.035
 surfaces+0.035
Neg logit contributions
 oh-0.043
 dear-0.041
 Please-0.039
Run-0.038
 Uncle-0.037
Token and
Token ID290
Feature activation+1.59e-02

Pos logit contributions
 ease+0.054
ws+0.052
zip+0.051
ips+0.050
 surfaces+0.049
Neg logit contributions
 oh-0.045
 dear-0.042
 Please-0.040
Run-0.039
 Uncle-0.038
Token started
Token ID2067
Feature activation-4.37e-01

Pos logit contributions
 oh+0.000
 dear+0.000
Run+0.000
 Please+0.000
 Uncle+0.000
Neg logit contributions
 ease-0.000
ws-0.000
ips-0.000
zip-0.000
 surfaces-0.000
Token spraying
Token ID40170
Feature activation-1.97e+00

Pos logit contributions
ws+0.011
 ease+0.011
 surfaces+0.011
zip+0.011
ips+0.010
Neg logit contributions
 oh-0.009
 Uncle-0.008
 hello-0.008
 sorry-0.007
 dear-0.007
Token water
Token ID1660
Feature activation-1.98e+00

Pos logit contributions
 ease+0.043
ws+0.042
pack+0.041
zip+0.041
ge+0.040
Neg logit contributions
 oh-0.047
 dear-0.043
Run-0.042
 Uncle-0.041
 Please-0.041
Token on
Token ID319
Feature activation-3.39e-01

Pos logit contributions
 ease+0.046
ws+0.045
zip+0.044
pack+0.043
nob+0.042
Neg logit contributions
 oh-0.044
Run-0.040
 dear-0.040
 Please-0.039
 Uncle-0.038
Token his
Token ID465
Feature activation-1.88e+00

Pos logit contributions
ws+0.010
pack+0.009
ips+0.009
zip+0.009
ics+0.009
Neg logit contributions
 oh-0.006
 Uncle-0.006
 dear-0.005
 Farmer-0.005
Run-0.005
Token yellow
Token ID7872
Feature activation-2.87e+00

Pos logit contributions
 ease+0.041
ws+0.040
zip+0.038
ips+0.038
ge+0.037
Neg logit contributions
 oh-0.047
 dear-0.045
 Uncle-0.042
 Please-0.042
 bye-0.041
Token and
Token ID290
Feature activation+2.01e+00

Pos logit contributions
 oh+0.020
 dear+0.019
Run+0.018
 Please+0.018
 gust+0.017
Neg logit contributions
 ease-0.022
ws-0.021
zip-0.020
ips-0.020
nob-0.020
Token they
Token ID484
Feature activation-4.65e-01

Pos logit contributions
 oh+0.039
 dear+0.037
Run+0.036
 Please+0.035
 gust+0.033
Neg logit contributions
 ease-0.053
ws-0.050
zip-0.049
 rebuild-0.048
 neatly-0.048
Token applauded
Token ID38911
Feature activation+1.33e+00

Pos logit contributions
ws+0.010
 surfaces+0.010
 ease+0.010
nob+0.010
ips+0.010
Neg logit contributions
 oh-0.011
 dear-0.010
 bye-0.010
 Please-0.010
Run-0.010
Token him
Token ID683
Feature activation+3.95e-01

Pos logit contributions
 oh+0.035
 dear+0.034
Run+0.033
 Please+0.032
 gust+0.031
Neg logit contributions
 ease-0.027
ws-0.024
ips-0.023
 rebuild-0.023
 surfaces-0.023
Token too
Token ID1165
Feature activation+2.56e+00

Pos logit contributions
 oh+0.008
 dear+0.008
 Please+0.007
 bye+0.007
 Uncle+0.007
Neg logit contributions
 ease-0.010
ws-0.010
zip-0.010
ips-0.009
 surfaces-0.009
Token.
Token ID13
Feature activation-1.07e+00

Pos logit contributions
 oh+0.067
 dear+0.062
Run+0.060
 Please+0.059
 gust+0.057
Neg logit contributions
 ease-0.052
ws-0.048
ips-0.046
ge-0.046
 neatly-0.046
Token It
Token ID632
Feature activation-9.97e-01

Pos logit contributions
 ease+0.027
ws+0.026
zip+0.025
ips+0.025
 surfaces+0.024
Neg logit contributions
 oh-0.023
 dear-0.022
Run-0.020
 Please-0.020
 gust-0.019
Token was
Token ID373
Feature activation+1.60e+00

Pos logit contributions
 ease+0.023
ws+0.023
ips+0.022
zip+0.021
 surfaces+0.021
Neg logit contributions
 oh-0.023
 dear-0.022
 Please-0.021
Run-0.021
 Uncle-0.020
Token a
Token ID257
Feature activation+1.87e+00

Pos logit contributions
 oh+0.038
 dear+0.037
Run+0.036
 Please+0.035
 bye+0.034
Neg logit contributions
 ease-0.036
ws-0.033
zip-0.031
 neatly-0.031
nob-0.031
Token great
Token ID1049
Feature activation+6.33e-01

Pos logit contributions
 oh+0.046
Run+0.043
 dear+0.042
 Please+0.041
 Uncle+0.039
Neg logit contributions
 ease-0.039
 stairs-0.036
 driveway-0.035
 ramp-0.035
zip-0.034
Token day
Token ID1110
Feature activation+2.00e+00

Pos logit contributions
 oh+0.017
 dear+0.016
Run+0.016
 Please+0.015
 Uncle+0.014
Neg logit contributions
 ease-0.013
ws-0.011
 stairs-0.011
pack-0.011
 surfaces-0.011
Token,
Token ID11
Feature activation+4.31e+00

Pos logit contributions
Run+0.056
 oh+0.052
 gru+0.050
 Please+0.048
 dear+0.047
Neg logit contributions
 ease-0.075
ws-0.075
zip-0.072
ips-0.071
ics-0.070
Token car
Token ID1097
Feature activation+3.91e-01

Pos logit contributions
 oh+0.087
Run+0.086
 dear+0.078
 gru+0.078
 Please+0.077
Neg logit contributions
 ease-0.103
ws-0.098
 stairs-0.097
zip-0.094
nob-0.094
Token.
Token ID13
Feature activation+2.63e+00

Pos logit contributions
 oh+0.008
 dear+0.007
Run+0.007
 Please+0.007
 Uncle+0.007
Neg logit contributions
 ease-0.010
ws-0.010
zip-0.009
ips-0.009
pack-0.009
Token I
Token ID314
Feature activation-1.95e+00

Pos logit contributions
 oh+0.047
 gust+0.042
Run+0.041
 gru+0.039
 dear+0.039
Neg logit contributions
zip-0.071
 ease-0.069
ws-0.068
 stairs-0.066
ips-0.066
Token drew
Token ID9859
Feature activation-5.28e-01

Pos logit contributions
 ease+0.046
 surfaces+0.044
ics+0.042
ips+0.042
 stairs+0.042
Neg logit contributions
 oh-0.044
 dear-0.042
 Please-0.042
Run-0.040
 Can-0.038
Token a
Token ID257
Feature activation-1.64e-02

Pos logit contributions
 ease+0.013
ws+0.013
zip+0.012
ips+0.012
pack+0.012
Neg logit contributions
 oh-0.011
 dear-0.011
 Please-0.010
Run-0.010
 bye-0.010
Token pink
Token ID11398
Feature activation-1.45e+00

Pos logit contributions
 ease+0.000
ws+0.000
ips+0.000
zip+0.000
nob+0.000
Neg logit contributions
 oh-0.000
 dear-0.000
 Please-0.000
 bye-0.000
 Uncle-0.000
Token flower
Token ID15061
Feature activation-1.99e-01

Pos logit contributions
 ease+0.029
ws+0.028
zip+0.027
ips+0.026
nob+0.026
Neg logit contributions
 oh-0.038
 dear-0.036
Run-0.034
 Please-0.034
 Uncle-0.033
Token.
Token ID13
Feature activation+2.00e+00

Pos logit contributions
 ease+0.005
ws+0.005
zip+0.005
 surfaces+0.005
ips+0.005
Neg logit contributions
 oh-0.004
 dear-0.004
 Please-0.004
Run-0.004
 bye-0.004
Token Isn
Token ID25110
Feature activation-2.02e+00

Pos logit contributions
 oh+0.038
Run+0.034
 gust+0.033
 dear+0.031
 goat+0.031
Neg logit contributions
zip-0.052
ws-0.051
 ease-0.051
nob-0.049
 chooses-0.048
Token't
Token ID470
Feature activation+1.58e+00

Pos logit contributions
ws+0.039
 ease+0.037
 surfaces+0.036
 bank+0.036
 driveway+0.035
Neg logit contributions
 oh-0.055
 dear-0.053
 Please-0.052
 gust-0.050
Run-0.049
Token They
Token ID1119
Feature activation+3.01e-01

Pos logit contributions
 ease+0.021
ws+0.021
zip+0.020
ips+0.020
pack+0.020
Neg logit contributions
 oh-0.016
 dear-0.015
 Please-0.014
Run-0.014
 Uncle-0.014
Token saw
Token ID2497
Feature activation+1.10e+00

Pos logit contributions
 oh+0.006
 dear+0.006
 Please+0.005
Run+0.005
 Uncle+0.005
Neg logit contributions
 ease-0.008
ws-0.008
zip-0.007
ips-0.007
 surfaces-0.007
Token a
Token ID257
Feature activation+2.15e+00

Pos logit contributions
 oh+0.026
 Please+0.026
 dear+0.026
Run+0.024
 bye+0.023
Neg logit contributions
 ease-0.024
ws-0.022
 stairs-0.022
 tops-0.021
zip-0.021
Token big
Token ID1263
Feature activation+1.63e+00

Pos logit contributions
 oh+0.059
 dear+0.056
 Please+0.054
Run+0.053
 bye+0.052
Neg logit contributions
 stairs-0.037
 ease-0.037
 ramp-0.036
zip-0.035
 driveway-0.035
Token metal
Token ID6147
Feature activation+2.14e-01

Pos logit contributions
 oh+0.051
 dear+0.049
 Please+0.047
Run+0.046
 importantly+0.045
Neg logit contributions
 stairs-0.023
 ramp-0.023
 bank-0.021
 driveway-0.021
 blocks-0.021
Token thing
Token ID1517
Feature activation+3.54e-01

Pos logit contributions
 oh+0.006
 dear+0.006
 Please+0.006
Run+0.006
 Uncle+0.006
Neg logit contributions
 ease-0.003
ws-0.003
 stairs-0.003
 bank-0.003
zip-0.003
Token with
Token ID351
Feature activation-1.15e+00

Pos logit contributions
 oh+0.009
 dear+0.008
 Please+0.008
Run+0.008
 bye+0.008
Neg logit contributions
 ease-0.007
ws-0.007
 stairs-0.007
nob-0.007
ips-0.007
Token wheels
Token ID13666
Feature activation-1.43e+00

Pos logit contributions
 ease+0.025
ws+0.025
ips+0.024
zip+0.024
pack+0.023
Neg logit contributions
 oh-0.027
 dear-0.026
Run-0.024
 Please-0.024
 Uncle-0.024
Token and
Token ID290
Feature activation+4.91e-02

Pos logit contributions
 ease+0.041
ws+0.040
zip+0.039
nob+0.038
 surfaces+0.038
Neg logit contributions
 oh-0.023
 dear-0.023
 Please-0.020
 hello-0.020
Run-0.020
Token a
Token ID257
Feature activation-3.44e-01

Pos logit contributions
 oh+0.001
 dear+0.001
Run+0.001
 Please+0.001
 Uncle+0.001
Neg logit contributions
 ease-0.001
ws-0.001
ips-0.001
zip-0.001
pack-0.001
Token horn
Token ID12718
Feature activation-8.68e-01

Pos logit contributions
 ease+0.007
ws+0.007
ips+0.007
zip+0.007
nob+0.006
Neg logit contributions
 oh-0.009
 dear-0.008
 Please-0.008
Run-0.008
 Uncle-0.008
Token and
Token ID290
Feature activation-2.10e+00

Pos logit contributions
 ease+0.054
zip+0.049
ws+0.049
irs+0.048
ips+0.046
Neg logit contributions
 oh-0.021
 dear-0.021
 Uncle-0.019
 sorry-0.019
 Please-0.018
Token help
Token ID1037
Feature activation-2.05e+00

Pos logit contributions
 ease+0.044
ips+0.041
ge+0.041
ws+0.040
eds+0.039
Neg logit contributions
 oh-0.055
 dear-0.051
 Please-0.050
 Uncle-0.050
 sorry-0.050
Token them
Token ID606
Feature activation-2.60e+00

Pos logit contributions
 ease+0.047
ws+0.046
ips+0.045
zag+0.045
 ramp+0.044
Neg logit contributions
 oh-0.050
 Uncle-0.045
 dear-0.044
 Please-0.042
 bye-0.042
Token wash
Token ID13502
Feature activation-2.87e+00

Pos logit contributions
 ease+0.065
ips+0.062
aries+0.060
eds+0.059
orns+0.059
Neg logit contributions
 oh-0.056
 sorry-0.055
 dear-0.054
 curious-0.049
 hello-0.049
Token.
Token ID13
Feature activation+4.76e-01

Pos logit contributions
 ease+0.065
ws+0.061
zip+0.060
 driveway+0.059
ips+0.059
Neg logit contributions
 oh-0.069
 dear-0.064
 Uncle-0.061
 Please-0.060
Run-0.059
Token They
Token ID1119
Feature activation-3.34e+00

Pos logit contributions
 oh+0.013
 dear+0.012
Run+0.011
 gust+0.011
 Please+0.011
Neg logit contributions
 ease-0.010
ws-0.009
zip-0.009
ips-0.009
 stairs-0.009
Token give
Token ID1577
Feature activation-1.67e+00

Pos logit contributions
eds+0.084
ips+0.082
ge+0.082
 ease+0.082
zag+0.080
Neg logit contributions
 dear-0.067
 sorry-0.065
 Uncle-0.061
 bye-0.060
Run-0.059
Token them
Token ID606
Feature activation-1.26e+00

Pos logit contributions
 ease+0.027
ws+0.026
ge+0.025
zip+0.025
nob+0.025
Neg logit contributions
 oh-0.051
 dear-0.048
 Uncle-0.046
 Please-0.045
Run-0.045
Token towels
Token ID38494
Feature activation-3.23e+00

Pos logit contributions
 ease+0.028
ws+0.026
zip+0.025
ips+0.025
ge+0.025
Neg logit contributions
 oh-0.031
 dear-0.029
 Please-0.027
 Uncle-0.027
Run-0.027
Token and
Token ID290
Feature activation-1.08e+00

Pos logit contributions
 ease+0.093
ws+0.089
zip+0.086
ips+0.085
 surfaces+0.085
Neg logit contributions
 oh-0.056
 dear-0.054
 Uncle-0.049
 Please-0.049
 bye-0.048
Token p
Token ID279
Feature activation-3.16e+00

Pos logit contributions
 ease+0.019
ws+0.018
zip+0.017
ips+0.017
 surfaces+0.016
Neg logit contributions
 oh-0.031
 dear-0.030
 Please-0.029
Run-0.028
 Uncle-0.028
Token clever
Token ID14169
Feature activation+1.32e+00

Pos logit contributions
 ease+0.021
ws+0.020
zip+0.019
nob+0.019
ips+0.019
Neg logit contributions
 oh-0.023
 dear-0.022
 Please-0.021
Run-0.021
 Uncle-0.020
Token before
Token ID878
Feature activation+2.50e+00

Pos logit contributions
 oh+0.033
Run+0.031
 dear+0.031
 Please+0.030
 sorry+0.029
Neg logit contributions
 ease-0.027
nob-0.024
ws-0.024
 bank-0.023
 surfaces-0.023
Token.
Token ID13
Feature activation-5.32e-01

Pos logit contributions
 oh+0.060
 dear+0.057
Run+0.054
 Please+0.054
 bye+0.052
Neg logit contributions
 ease-0.056
ws-0.052
ips-0.050
nob-0.050
zip-0.050
Token He
Token ID679
Feature activation-3.63e-02

Pos logit contributions
 ease+0.013
ws+0.013
zip+0.012
nob+0.012
 surfaces+0.012
Neg logit contributions
 oh-0.012
 dear-0.011
Run-0.010
 Please-0.010
 gust-0.010
Token wanted
Token ID2227
Feature activation-2.55e-01

Pos logit contributions
 ease+0.001
ws+0.001
ips+0.001
zip+0.001
 surfaces+0.001
Neg logit contributions
 oh-0.001
 dear-0.001
 Please-0.001
Run-0.001
 Uncle-0.001
Token to
Token ID284
Feature activation-3.83e+00

Pos logit contributions
 ease+0.006
ws+0.006
zip+0.005
ips+0.005
 surfaces+0.005
Neg logit contributions
 oh-0.006
 dear-0.006
 Please-0.005
 Uncle-0.005
Run-0.005
Token watch
Token ID2342
Feature activation+2.27e-02

Pos logit contributions
ips+0.087
ws+0.083
 channels+0.079
oons+0.078
 surfaces+0.077
Neg logit contributions
 oh-0.092
 dear-0.090
 hello-0.088
 Please-0.086
 bye-0.085
Token the
Token ID262
Feature activation-4.59e-01

Pos logit contributions
 oh+0.001
 dear+0.001
Run+0.000
 Please+0.000
 Uncle+0.000
Neg logit contributions
 ease-0.001
ws-0.000
zip-0.000
ips-0.000
nob-0.000
Token bird
Token ID6512
Feature activation-1.44e+00

Pos logit contributions
 ease+0.014
ws+0.013
zip+0.013
ips+0.013
pack+0.013
Neg logit contributions
 oh-0.008
 dear-0.007
Run-0.007
 Please-0.007
 bye-0.006
Token as
Token ID355
Feature activation+3.23e+00

Pos logit contributions
ws+0.037
 ease+0.036
ips+0.036
zip+0.036
pack+0.035
Neg logit contributions
 oh-0.029
 dear-0.028
 hello-0.026
 sorry-0.025
 bye-0.025
Token it
Token ID340
Feature activation-2.93e+00

Pos logit contributions
 oh+0.071
 dear+0.067
 Please+0.064
Run+0.063
 bye+0.060
Neg logit contributions
 ease-0.081
ws-0.074
zip-0.072
ge-0.071
 stairs-0.070
Token saw
Token ID2497
Feature activation+1.19e+00

Pos logit contributions
 oh+0.023
 dear+0.022
 Uncle+0.021
Run+0.019
 bye+0.019
Neg logit contributions
nob-0.043
ws-0.043
 ease-0.042
 bank-0.042
 build-0.041
Token the
Token ID262
Feature activation-1.13e+00

Pos logit contributions
 oh+0.031
 dear+0.030
 Please+0.029
Run+0.028
 gust+0.027
Neg logit contributions
 ease-0.024
ws-0.023
zip-0.021
ips-0.021
nob-0.021
Token bird
Token ID6512
Feature activation-1.45e+00

Pos logit contributions
 ease+0.032
ws+0.032
zip+0.031
ips+0.030
nob+0.030
Neg logit contributions
 oh-0.021
 dear-0.019
 Please-0.018
Run-0.017
 Uncle-0.017
Token and
Token ID290
Feature activation+9.04e-01

Pos logit contributions
 ease+0.044
ws+0.043
zip+0.042
ips+0.042
 surfaces+0.041
Neg logit contributions
 oh-0.022
 dear-0.021
 sorry-0.020
 hello-0.019
Run-0.019
Token wanted
Token ID2227
Feature activation-6.81e-01

Pos logit contributions
 oh+0.018
 dear+0.017
 Please+0.016
 Uncle+0.016
Run+0.016
Neg logit contributions
 ease-0.022
ws-0.022
 bank-0.021
 stairs-0.021
zip-0.021
Token to
Token ID284
Feature activation-3.81e+00

Pos logit contributions
 ease+0.016
ws+0.016
ips+0.016
zip+0.015
pack+0.015
Neg logit contributions
 oh-0.016
 dear-0.014
 Please-0.014
 Uncle-0.014
 hello-0.013
Token help
Token ID1037
Feature activation-1.60e+00

Pos logit contributions
ips+0.102
ws+0.095
 channels+0.095
ge+0.094
 ease+0.093
Neg logit contributions
 dear-0.076
 oh-0.076
 Please-0.069
 hello-0.069
 bye-0.069
Token.
Token ID13
Feature activation-6.58e-01

Pos logit contributions
 ease+0.039
ws+0.038
ips+0.037
zip+0.036
pack+0.036
Neg logit contributions
 oh-0.036
 dear-0.033
 Please-0.032
 Uncle-0.031
Run-0.031
Token\n
Token ID198
Feature activation-2.53e+00

Pos logit contributions
 ease+0.012
zip+0.011
ws+0.011
 stairs+0.011
 bank+0.010
Neg logit contributions
 oh-0.019
 dear-0.018
Run-0.017
 gust-0.016
 Please-0.016
Token\n
Token ID198
Feature activation-3.39e-01

Pos logit contributions
 ease+0.063
ws+0.062
 surfaces+0.060
ips+0.059
 bank+0.058
Neg logit contributions
 oh-0.052
 dear-0.049
 Please-0.048
Run-0.048
 Uncle-0.048
Token"
Token ID1
Feature activation+3.79e-01

Pos logit contributions
zip+0.005
 ease+0.005
nob+0.005
ws+0.005
pack+0.005
Neg logit contributions
 oh-0.010
 dear-0.010
 sorry-0.009
 Uncle-0.009
 Please-0.009
Token wanted
Token ID2227
Feature activation-6.43e-01

Pos logit contributions
 ease+0.001
ws+0.001
zip+0.001
nob+0.001
 bank+0.001
Neg logit contributions
 oh-0.001
 dear-0.001
 Please-0.001
 Uncle-0.001
Run-0.001
Token to
Token ID284
Feature activation-3.48e+00

Pos logit contributions
ws+0.016
 ease+0.016
ips+0.016
zip+0.016
ge+0.015
Neg logit contributions
 oh-0.014
 Uncle-0.012
 dear-0.012
 Please-0.011
 sorry-0.011
Token escape
Token ID6654
Feature activation-1.06e+00

Pos logit contributions
ips+0.075
ws+0.073
 channels+0.068
ge+0.067
ushes+0.067
Neg logit contributions
 oh-0.090
 dear-0.084
 sorry-0.081
Run-0.080
 bye-0.080
Token but
Token ID475
Feature activation+5.54e-01

Pos logit contributions
 ease+0.027
ws+0.026
zip+0.025
ips+0.025
 surfaces+0.024
Neg logit contributions
 oh-0.022
 dear-0.021
 Please-0.020
Run-0.020
 Uncle-0.019
Token he
Token ID339
Feature activation-2.99e-01

Pos logit contributions
 dear+0.013
 oh+0.012
Run+0.012
 Please+0.012
 goat+0.011
Neg logit contributions
 ease-0.013
 neatly-0.011
zip-0.011
ws-0.011
 build-0.011
Token couldn
Token ID3521
Feature activation-3.54e+00

Pos logit contributions
 ease+0.007
ws+0.006
nob+0.006
zip+0.006
sey+0.006
Neg logit contributions
 dear-0.007
 oh-0.007
 Please-0.006
Run-0.006
 hello-0.006
Token't
Token ID470
Feature activation-1.99e+00

Pos logit contributions
ws+0.080
 ease+0.073
 surfaces+0.071
 outfits+0.070
wed+0.068
Neg logit contributions
 oh-0.085
 Please-0.084
Run-0.083
 sorry-0.079
 gust-0.078
Token move
Token ID1445
Feature activation-1.08e+00

Pos logit contributions
 ease+0.046
ws+0.045
ips+0.044
 channels+0.043
zip+0.042
Neg logit contributions
 oh-0.046
 dear-0.042
Run-0.042
 Please-0.041
 bye-0.040
Token.
Token ID13
Feature activation-9.94e-01

Pos logit contributions
 ease+0.026
ws+0.026
zip+0.024
ips+0.024
pack+0.024
Neg logit contributions
 oh-0.024
 dear-0.022
 Please-0.021
Run-0.021
 sorry-0.021
Token Joe
Token ID5689
Feature activation+1.24e+00

Pos logit contributions
 ease+0.020
ws+0.019
zip+0.018
ips+0.018
nob+0.018
Neg logit contributions
 oh-0.026
 dear-0.025
 Please-0.024
Run-0.024
 bye-0.023
Token started
Token ID2067
Feature activation-2.05e-01

Pos logit contributions
 oh+0.031
 dear+0.031
 Please+0.028
Run+0.028
 Uncle+0.028
Neg logit contributions
 ease-0.025
ws-0.024
 bank-0.023
zip-0.023
nob-0.023
Token.
Token ID13
Feature activation-3.86e-01

Pos logit contributions
 oh+0.001
 dear+0.001
 Please+0.001
Run+0.001
 Uncle+0.001
Neg logit contributions
 ease-0.001
ws-0.001
zip-0.001
ips-0.001
pack-0.001
Token They
Token ID1119
Feature activation-6.88e-01

Pos logit contributions
 ease+0.008
ws+0.007
zip+0.007
ips+0.007
nob+0.007
Neg logit contributions
 oh-0.010
 dear-0.010
Run-0.009
 Please-0.009
 bye-0.009
Token couldn
Token ID3521
Feature activation-3.99e+00

Pos logit contributions
ips+0.017
ws+0.017
zip+0.016
 surfaces+0.016
nob+0.016
Neg logit contributions
 oh-0.014
 dear-0.013
Run-0.013
 Uncle-0.012
 Please-0.012
Token't
Token ID470
Feature activation-2.81e+00

Pos logit contributions
ws+0.097
 surfaces+0.087
 ease+0.086
wed+0.085
zip+0.084
Neg logit contributions
 Please-0.086
 oh-0.085
Run-0.084
 sorry-0.081
 Uncle-0.080
Token wait
Token ID4043
Feature activation+2.56e-01

Pos logit contributions
ips+0.078
ws+0.077
zip+0.075
 ease+0.075
ics+0.073
Neg logit contributions
 oh-0.052
 dear-0.048
Run-0.045
 Please-0.045
 sorry-0.042
Token to
Token ID284
Feature activation-4.09e+00

Pos logit contributions
 oh+0.006
 dear+0.006
 Please+0.006
Run+0.006
 Uncle+0.005
Neg logit contributions
 ease-0.006
ws-0.005
zip-0.005
ips-0.005
 surfaces-0.005
Token show
Token ID905
Feature activation-6.02e-01

Pos logit contributions
ips+0.096
oons+0.088
ws+0.087
 channels+0.083
ushes+0.083
Neg logit contributions
 Harry-0.092
 dear-0.091
 oh-0.091
 Uncle-0.090
 hello-0.089
Token it
Token ID340
Feature activation-1.46e+00

Pos logit contributions
 ease+0.013
ws+0.012
zip+0.012
ips+0.011
 surfaces+0.011
Neg logit contributions
 oh-0.015
 dear-0.015
 Please-0.014
Run-0.014
 Uncle-0.013
Token to
Token ID284
Feature activation-7.21e-02

Pos logit contributions
ws+0.032
zip+0.032
ips+0.032
 ease+0.032
 driveway+0.031
Neg logit contributions
 oh-0.034
 dear-0.032
 hello-0.032
 bye-0.030
 sorry-0.030
Token their
Token ID511
Feature activation-9.91e-02

Pos logit contributions
 ease+0.002
ws+0.002
zip+0.002
ips+0.002
nob+0.002
Neg logit contributions
 oh-0.002
 dear-0.002
 Please-0.001
Run-0.001
 Uncle-0.001
Token mum
Token ID25682
Feature activation+2.17e+00

Pos logit contributions
 ease+0.003
ws+0.002
zip+0.002
ips+0.002
nob+0.002
Neg logit contributions
 oh-0.002
 dear-0.002
 Please-0.002
 Uncle-0.002
 bye-0.002