INDEX
Explanations
words related to mindsets and attitudes
references to different types of mentalities or mindsets
Negative Logits
tein            -0.76
cutting         -0.75
icles           -0.75
enegger         -0.73
weather         -0.70
lev             -0.70
gotten          -0.69
arers           -0.68
cuts            -0.68
selling         -0.67
Positive Logits
mentality       1.17
mindset         1.02
yip             0.85
attitude        0.81
ãħĭãħĭ          0.69
attRot          0.68
arrogance       0.67
istical         0.65
inconsistency   0.65
achine          0.65
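The token ãħĭãħĭ in the positive list looks like encoding debris, but it is probably the byte-level spelling of a non-ASCII token. The sketch below assumes the vocabulary uses the GPT-2 style byte-to-character table (an assumption; the page does not name the tokenizer). The helper names `bytes_to_unicode` and `decode_token` are illustrative, not part of the dashboard.

```python
def bytes_to_unicode():
    """Build the GPT-2 style byte -> printable-character table (assumed tokenizer)."""
    # Bytes that are already printable map to themselves.
    bs = (
        list(range(ord("!"), ord("~") + 1))
        + list(range(0xA1, 0xAC + 1))
        + list(range(0xAE, 0xFF + 1))
    )
    cs = bs[:]
    n = 0
    # Every remaining byte is shifted to a code point above U+00FF.
    for b in range(256):
        if b not in bs:
            bs.append(b)
            cs.append(256 + n)
            n += 1
    return dict(zip(bs, (chr(c) for c in cs)))


# Inverse table: printable character -> original byte value.
BYTE_DECODER = {ch: b for b, ch in bytes_to_unicode().items()}


def decode_token(token: str) -> str:
    """Map a vocabulary string back to raw bytes, then decode them as UTF-8."""
    raw = bytes(BYTE_DECODER[ch] for ch in token)
    return raw.decode("utf-8", errors="replace")


if __name__ == "__main__":
    print(decode_token("ãħĭãħĭ"))   # the token from the positive-logit list
    print(decode_token("mindset"))  # ASCII tokens come back unchanged
```

Under that assumption the token decodes to the Hangul jamo pair ㅋㅋ (U+314B twice), which is a real vocabulary entry and not a scraping artifact. Sub-word fragments such as "istical" or "achine" are unaffected by this step.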
Activations Density 0.014%
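For scale, a density of 0.014% corresponds to roughly 14 active tokens per 100,000. The sketch below shows one common way such a figure could be computed, assuming density means the share of sampled tokens on which the feature's activation exceeds zero (the page does not define it). The function name `activation_density` and the `threshold` parameter are hypothetical.

```python
def activation_density(activations, threshold=0.0):
    """Return the percentage of tokens whose activation exceeds `threshold`.

    Assumption: "density" is the fraction of sampled tokens with a
    non-zero activation; the dashboard itself does not state this.
    """
    if not activations:
        raise ValueError("need at least one activation value")
    active = sum(1 for a in activations if a > threshold)
    return 100.0 * active / len(activations)


if __name__ == "__main__":
    # Synthetic data for illustration only: 14 active tokens out of 100,000.
    sample = [0.0] * 99_986 + [1.5] * 14
    print(f"{activation_density(sample):.3f}%")
```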