INDEX
Explanations
phrases indicating value judgment or importance
expressions emphasizing the importance or value of something
Negative Logits
Oops           -0.66
Vehicles       -0.66
gdala          -0.64
Nou            -0.64
Sierra         -0.63
Dogs           -0.62
Equipment      -0.62
Breast         -0.62
Underground    -0.61
baugh          -0.60
Positive Logits
folios          0.88
iness           0.86
orth            0.86
ily             0.85
otine           0.77
olulu           0.77
consideration   0.77
careful         0.75
ensing          0.74
entin           0.74
Activations Density: 0.016%