Explanations
mentions related to organizations or entities
Negative Logits
oneself       -0.64
pire          -0.64
guiName       -0.63
Gur           -0.59
Guan          -0.59
ostic         -0.58
Mayweather    -0.58
////          -0.58
furt          -0.58
Detected      -0.58
Positive Logits
mates           1.76
mate            1.48
mates           1.45
mate            1.11
counterparts    0.98
colleague       0.92
brethren        0.87
leader          0.87
men             0.86
colleagues      0.81
Activations Density 0.200%