INDEX
Explanations
expressions related to personal beliefs or experiences
personal reflections on life experiences
Negative Logits
ãĥĥãĥĪ         -0.65
viks           -0.64
adelphia       -0.62
ackers         -0.62
ãĥī            -0.62
Officials      -0.62
Pwr            -0.61
ItemTracker    -0.61
ranch          -0.60
©¶æ¥µ          -0.59
Positive Logits
my             2.06
myself         1.68
I              1.66
me             1.43
MY             1.36
mine           1.34
my             1.34
My             1.33
My             1.32
I              1.21
Activations Density: 1.128%