INDEX
Explanations
articles (a, an, the)
articles, specifically the words "a" and "an."
Negative Logits
UD          -0.74
UDP         -0.67
Provision   -0.63
PID         -0.62
ank         -0.61
Nemesis     -0.60
gear        -0.60
Submission  -0.60
Sadd        -0.60
braking     -0.58
Positive Logits
ocratic     0.98
emic        0.91
esthetic    0.91
ms          0.90
ria         0.90
alian       0.87
wn          0.87
omorph      0.86
ocracy      0.86
oca         0.83
Activation Density: 0.142%