Explanations
numerical percentages or statistics
Negative Logits
pecially    -0.83
etheless    -0.81
compan      -0.78
neighb      -0.76
itiz        -0.74
brace       -0.74
uras        -0.74
shaped      -0.74
compr       -0.73
lett        -0.72
Positive Logits
-+          0.90
%)          0.78
%           0.76
%-          0.74
+)          0.72
Mehran      0.72
âĨij        0.71
ABV         0.69
-|          0.68
Ibid        0.68
Activations Density 0.049%