INDEX
Explanations
phrases expressing extreme certainty or negation
occurrences of the phrase "at all."
Negative Logits
Token         Logit
pu            -0.65
Chancellor    -0.62
Patriarch     -0.61
uay           -0.58
lict          -0.57
Koen          -0.57
proverb       -0.57
lav           -0.56
chancellor    -0.55
lf            -0.55
Positive Logits
Token           Logit
ocating         0.89
ãĥīãĥ©ãĤ´ãĥ³    0.81
onse            0.78
else            0.76
onge            0.74
except          0.73
levels          0.72
ones            0.72
å¸              0.71
times           0.71
Activations Density 0.024%
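
The density figure above can be read as the share of token positions on which the feature fires. The sketch below illustrates that reading under an assumption: that "density" means the fraction of sampled tokens whose activation is above zero. The function name `activation_density`, the `threshold` parameter, and the sample data are all hypothetical and are not taken from the source page.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of token positions whose activation exceeds threshold.

    Assumption: density = (number of active tokens) / (number of tokens sampled).
    """
    if not activations:
        raise ValueError("activations must be non-empty")
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical sample: 3 active positions out of 12,500 tokens.
sample = [0.0] * 12_497 + [1.3, 0.4, 2.1]
density = activation_density(sample)
print(f"{density:.3%}")  # prints "0.024%"
```

Under this reading, a density of 0.024% corresponds to roughly 24 active tokens per 100,000 sampled, which matches a feature that responds only to a narrow set of phrases.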