INDEX
Explanations
words and phrases used to emphasize a point or opinion
thing/reason/purpose
Negative Logits
с        -0.35
where    -0.35
An       -0.34
まま     -0.34
sz       -0.33
An       -0.32
vi       -0.32
***!     -0.32
tils     -0.32
roll     -0.32
Positive Logits
thing          1.13
goal           1.02
itſelf         1.00
purpoſe        0.96
Jefus          0.96
consequence    0.96
reaſon         0.95
myſelf         0.93
reason         0.92
objective      0.91
Activations Density 1.433%