Explanations
phrases indicating recognition or identification of subjects
Negative Logits
s               -0.62
<h3>            -0.61
y               -0.60
AuthProvider    -0.60
Phillips        -0.59
גרת             -0.58
確認ください      -0.58
pac             -0.57
äť              -0.56
Raiders         -0.56
Positive Logits
KNOWN           1.16
KNOWN           1.16
Known           1.15
known           1.11
Known           1.10
known           1.07
complexContent  0.97
Portale         0.97
Bekannt         0.96
>=",            0.94
Activations Density 0.088%