INDEX
Explanations
the word "operations" and sometimes words associated with managing the temperature of an object
operations
Negative Logits
Logit   Token
-0.73
-0.72   1
-0.71   2
-0.67   the
-0.65   5
-0.61
-0.59   0
-0.59   3
-0.58   6
-0.58   ↵↵
Positive Logits
Logit   Token
1.47    purpoſe
1.43    ſtate
1.37    pleaſure
1.35    poffe
1.30    itſelf
1.26    juſt
1.25    reaſon
1.24    ainfi
1.24    themſelves
1.24    himſelf
Activations Density 0.537%