INDEX
Explanations
phrases related to urging or advising others to do or not do something
negative imperatives or expressions indicating prohibition
Negative Logits
catentry            -0.67
Judge               -0.66
Fourth              -0.65
rouse               -0.60
Analysis            -0.60
utherford           -0.60
ItemThumbnailImage  -0.59
Higher              -0.58
ullah               -0.58
METHOD              -0.57
Positive Logits
ï¸ı       0.73
kidding   0.72
=""       0.69
âĢº       0.64
ðŁij      0.64
(@        0.63
ymes      0.63
ĸļ        0.62
ðŁ        0.60
FANTASY   0.60
Activations Density 0.645%
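
The density figure can be read as the share of token positions on which the feature fires. The sketch below illustrates that reading under stated assumptions: "fires" is taken to mean an activation strictly above a threshold of 0, and the function name `activation_density` and the sample data are hypothetical, not taken from the dashboard.

```python
from typing import Sequence


def activation_density(activations: Sequence[float], threshold: float = 0.0) -> float:
    """Return the fraction of token positions whose activation exceeds `threshold`.

    Assumption: a feature "fires" on a token when its activation is strictly
    greater than `threshold` (0.0 by default).
    """
    if len(activations) == 0:
        raise ValueError("activations must not be empty")
    fired = sum(1 for a in activations if a > threshold)
    return fired / len(activations)


# Hypothetical sample: 1000 token positions, 6 of which activate the feature.
sample = [0.0] * 994 + [1.2, 0.4, 3.1, 0.9, 2.2, 0.1]
density = activation_density(sample)
print(f"Activations Density {density:.3%}")
```

A value well under 1%, like the one reported here, indicates a sparse feature that is active on only a small share of tokens.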