INDEX
Explanations
personal anecdotes and storytelling in informal language
Negative Logits
solitary   -0.60
inals      -0.57
prost      -0.54
urg        -0.53
©¶æ        -0.53
prus       -0.52
quadru     -0.51
adr        -0.49
brig       -0.48
driving    -0.47
Positive Logits
Anyway     0.74
Anyway     0.71
anyways    0.71
ado        0.67
adays      0.65
ipop       0.64
gotta      0.64
ymes       0.63
ional      0.62
lets       0.61
Activations Density: 11.035%