Explanations
phrases related to personal thoughts or narration
expressions of personal identity and status
Negative Logits
Token         Logit
accompanies   -0.74
bearer        -0.72
bda           -0.70
IMAGES        -0.68
narration     -0.67
autop         -0.64
edly          -0.63
cryptoc       -0.63
charism       -0.61
thumbnail     -0.60
Positive Logits
Token         Logit
gonna         1.02
[/            0.89
âĢ            0.87
!!"           0.81
Pak           0.81
""            0.78
fuckin        0.75
!'            0.74
"""           0.74
»             0.71
Activations Density: 0.137%