INDEX
Explanations
degree of abstract qualities
NEGATIVE LOGITS
Of            0.55
Of            0.46
OF            0.38
AlignedText   0.38
Primarily     0.37
Sebuah        0.37
of            0.37
一個          0.36
suatu         0.36
excessively   0.36
POSITIVE LOGITS
scepticism    0.43
apprehension  0.38
skepticism    0.37
pandémie      0.37
terv          0.35
vorschau      0.35
тог           0.35
obliqu        0.35
structural    0.35
apre          0.34
ACTIVATIONS DENSITY
0.032%