INDEX
Explanations
phrases expressing opinions, beliefs, or general statements
negations or phrases indicating what is not necessary or required
Negative Logits
龍契士          -0.67
integrity       -0.62
Forums          -0.62
resumes         -0.61
Adin            -0.60
homosexuality   -0.58
sincerity       -0.56
affinity        -0.56
MpServer        -0.56
pires           -0.56
Positive Logits
yourselves      0.95
yourself        0.94
gotta           0.87
Tube            0.86
need            0.85
hear            0.82
plin            0.81
expect          0.78
uld             0.78
realise         0.77
Activations Density 0.176%
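
A minimal sketch of how a density figure like the one above could be computed, assuming it means the fraction of token positions on which the feature has a nonzero activation. That reading is an assumption, not something this page states. The function name activation_density and the sample data are hypothetical and do not come from the dashboard.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of token positions where the feature is active.

    A position counts as active when its activation exceeds `threshold`
    (assumed to be 0.0 here; the dashboard's actual cutoff is not stated).
    """
    if not activations:
        raise ValueError("activations must be non-empty")
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical sample: 10,000 token positions, 18 of them with a nonzero activation.
sample = [0.0] * 9982 + [1.5] * 18
density = activation_density(sample)
print(f"{density:.3%}")
```

Under this reading, a density of 0.176% would mean the feature fires on roughly 1 in every 570 tokens of the evaluation text.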