Explanations
phrases and descriptors that indicate typicality or common examples
Negative Logits
Forst             -0.61
BorderFactory     -0.53
ссер              -0.53
зда               -0.52
eventbus          -0.49
Horne             -0.49
openConnection    -0.49
airbnb            -0.48
Seif              -0.48
raffredd          -0.47
Positive Logits
Typical           1.30
typical           1.27
typical           1.24
Typical           1.23
typique           1.04
典型              0.96
típico            0.90
atypical          0.90
TYP               0.88
typically         0.88
Activations Density 0.169%
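
As a rough illustration of what an activation density figure like the one above could mean, the sketch below computes the fraction of tokens on which a feature is active. It assumes that "density" means the share of sampled tokens with an activation above zero; the function name `activation_density`, the `threshold` parameter, and the toy data are all hypothetical and are not taken from the dashboard.

```python
from typing import Sequence


def activation_density(activations: Sequence[float], threshold: float = 0.0) -> float:
    """Return the fraction of tokens whose activation exceeds `threshold`.

    Assumption: density = (number of active tokens) / (total tokens sampled).
    """
    if not activations:
        raise ValueError("activations must be non-empty")
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical toy data: 10,000 tokens, of which 17 have a nonzero activation.
acts = [0.0] * 10_000
for i in range(17):
    acts[i * 500] = 1.5

density = activation_density(acts)
print(f"Activation density: {density:.3%}")
```

Under this reading, a density well below 1% would indicate a sparse feature that is active on only a small share of tokens.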