Explanations
references to opinions or statements made by different individuals
phrases indicating possession or association
Negative Logits
Champ                             -0.69
MX                                -0.65
<?                                -0.62
ocene                             -0.60
Written                           -0.58
CHAT                              -0.56
////////////////////////////////  -0.56
sew                               -0.56
kat                               -0.55
ãĤ¼ãĤ¦ãĤ¹                         -0.54
Positive Logits
course      0.85
sorts       0.81
course      0.71
¥µ          0.71
elia        0.65
una         0.65
ilities     0.62
those       0.61
edly        0.61
the         0.61
Activations Density: 0.057%