INDEX
Explanations
names of specific entities, such as people, places, and organizations
names of media, locations, and brands
Negative Logits
Token        Logit
="#          -0.70
ËĪ           -0.68
76561        -0.63
ileaks       -0.61
HAS          -0.61
ãĥ£          -0.60
ãĥ¼ãĥĨãĤ£    -0.59
endum        -0.58
":"/         -0.56
tains        -0.56
Positive Logits
Token         Logit
respectively  1.78
alike         1.12
Interstitial  0.93
+.            0.88
;             0.81
.             0.80
*.            0.80
Aven          0.80
etc           0.77
combined      0.76
Activations Density 0.411%
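
The page does not say how the density figure is computed. As a rough sketch, assuming it is the fraction of sampled token positions on which the feature's activation is above zero, it could be calculated as follows. The function name `activation_density`, the `threshold` parameter, and the sample data are hypothetical and are not taken from the dashboard.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of positions whose activation exceeds `threshold`.

    Assumption: "density" means the share of token positions where the
    feature fires at all; the dashboard's exact definition is not given.
    """
    if not activations:
        return 0.0
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical data: 1000 token positions, 4 of which activate the feature.
sample = [0.0] * 996 + [0.3, 1.2, 0.7, 2.1]
density = activation_density(sample)
print(f"{density:.3%}")  # prints 0.400%
```

Under this reading, a density of 0.411% would mean the feature fires on roughly 4 of every 1000 sampled tokens.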