INDEX
Explanations
connections and relationships among like-minded individuals
similarity and comparison
Negative Logits
claims          -0.33
claims          -0.33
defendant       -0.33
claim           -0.32
reclama         -0.31
claim           -0.31
the             -0.31
sug             -0.31
ApiProperty     -0.30
suger           -0.30
Positive Logits
WriteBarrier     0.70
الحره            0.68
évaluateur       0.59
transférez       0.58
autorytatywna    0.57
kindred          0.55
BASELINE         0.55
Similar          0.55
RectangleBorder  0.55
Similar          0.53
Activations Density: 0.018%