INDEX
Explanations
references to the concept of "OK" or its variations in different contexts
Negative Logits
Token      Logit
purpoſe    -0.94
neceff     -0.91
reaſon     -0.90
ſtate      -0.87
myſelf     -0.86
juſ        -0.85
poffible   -0.85
Monfieur   -0.83
pleaſure   -0.83
fubject    -0.82
Positive Logits
Token      Logit
shorter    1.05
smaller    1.01
lesser     0.81
smaller    0.72
Smaller    0.70
lower      0.66
bag        0.65
Smaller    0.64
pat        0.61
Shorter    0.58
Activations Density: 0.149%