INDEX
Explanations
nuances of existence and conditionality in statements
Negative Logits
bebasan        -0.60
termica        -0.55
particulière   -0.53
telles         -0.53
sibilità       -0.51
výbě           -0.50
väx            -0.50
rosse          -0.50
gué            -0.50
antichi        -0.49
Positive Logits
"])            0.87
PreferredItem  0.84
very           0.81
"):            0.81
)");           0.81
"""            0.78
highly         0.76
quite          0.74
extAlignment   0.74
"],            0.74
Activations Density: 0.549%