proposal: mime: expand on what is covered by builtinTypes #69530

AidanWelch opened this issue Sep 19, 2024 · 12 comments

@AidanWelch

Proposal Details

Right now, mime/type.go includes what seems to be a somewhat arbitrary list of built-in types:

var builtinTypesLower = map[string]string{
	".avif": "image/avif",
	".css":  "text/css; charset=utf-8",
	".gif":  "image/gif",
	".htm":  "text/html; charset=utf-8",
	".html": "text/html; charset=utf-8",
	".jpeg": "image/jpeg",
	".jpg":  "image/jpeg",
	".js":   "text/javascript; charset=utf-8",
	".json": "application/json",
	".mjs":  "text/javascript; charset=utf-8",
	".pdf":  "application/pdf",
	".png":  "image/png",
	".svg":  "image/svg+xml",
	".wasm": "application/wasm",
	".webp": "image/webp",
	".xml":  "text/xml; charset=utf-8",
}

I think some guidance on what belongs in this list would be good, so that consumers of the package are not caught out by arbitrary gaps. In the meantime I will submit a PR that incorporates all of the MDN-defined "Common Types" (which, I have to admit, is also an arbitrary list, but at least it covers more common use cases).
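To make the gap concrete, here is a minimal sketch that uses a trimmed copy of the map above. The helper `builtinLookup` is hypothetical and is not part of package mime; it only shows what a caller sees when no system MIME database fills in the missing entries.

```go
package main

import (
	"fmt"
	"strings"
)

// builtin is a trimmed, illustrative copy of the map quoted above.
var builtin = map[string]string{
	".html": "text/html; charset=utf-8",
	".json": "application/json",
	".png":  "image/png",
}

// builtinLookup is a hypothetical helper, not part of package mime.
// It lowercases the extension and returns "" when there is no entry.
func builtinLookup(ext string) string {
	return builtin[strings.ToLower(ext)]
}

func main() {
	// .csv and .txt are common extensions that have no entry in the list.
	for _, ext := range []string{".png", ".PNG", ".csv", ".txt"} {
		fmt.Printf("%-5s -> %q\n", ext, builtinLookup(ext))
	}
}
```

A caller that treats an empty result as "unknown" would typically fall back to application/octet-stream for such files.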

@gopherbot added this to the Proposal milestone Sep 19, 2024
@seankhliao
Member

what's included is based on WHATWG mime sniffing
https://mimesniff.spec.whatwg.org/
this gives us a clear spec to adhere to, rather than an arbitrary list.

@seankhliao changed the title from "proposal: mime: Expand on what is covered by builtinTypes" to "proposal: mime: expand on what is covered by builtinTypes" Sep 19, 2024
@AidanWelch
Author

AidanWelch commented Sep 19, 2024

@seankhliao Wow, thanks for the quick response, but I'm confused about where that spec singles out just the MIME types listed in builtinTypes. From my understanding, it is more relevant to net/http's DetectContentType, which actually sniffs content. For mime's ExtensionsByType and TypeByExtension, don't we assume that the given file extension or type is truthful and try to determine the most likely counterpart from it, whereas sniffing wouldn't even consider the given type or extension? (So sniffing would give most, if not all, plaintext types the same extension/type, for example.)
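The difference between the two approaches can be sketched as follows. `http.DetectContentType` and `mime.TypeByExtension` are the real standard library functions; the `sample` type, the `sniff` helper, and the file names are made up for illustration. The extension-based results are printed without an expected value because they depend on the local MIME database.

```go
package main

import (
	"fmt"
	"mime"
	"net/http"
	"path/filepath"
)

// sample pairs an illustrative file name with illustrative content.
type sample struct{ name, content string }

var samples = []sample{
	{"style.css", "body { color: red }"},
	{"data.json", `{"a": 1}`},
	{"notes.txt", "hello"},
}

// sniff reports the type inferred from the bytes alone.
func sniff(content string) string {
	return http.DetectContentType([]byte(content))
}

func main() {
	for _, s := range samples {
		byContent := sniff(s.content)                      // looks only at the bytes
		byExt := mime.TypeByExtension(filepath.Ext(s.name)) // looks only at the name
		fmt.Printf("%-10s sniffed=%q byExtension=%q\n", s.name, byContent, byExt)
	}
}
```

All three samples sniff as text/plain; charset=utf-8, since none of them carries a byte signature the sniffing algorithm recognizes, while the extension lookup can tell them apart whenever a mapping exists.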

AidanWelch added a commit to AidanWelch/go that referenced this issue Sep 19, 2024
Simply implements the first recommended type for each file extension listed in MDN https://developer.mozilla.org/en-US/docs/Web/HTTP/Basics_of_HTTP/MIME_types/Common_types

However, this excludes ".3gp" and ".3gp2" because, as far as I can tell, it is not possible to know whether such a file is video or audio solely from its extension.

As far as I can tell there are two previous PRs that each implemented a type simply because they were in common use.

Updates golang#69530
@gopherbot
Contributor

Change https://go.dev/cl/614376 mentions this issue: mime: extend "builtinTypes" to include a more complete list of common types

@ianlancetaylor moved this to Incoming in Proposals Sep 19, 2024
AidanWelch added a commit to AidanWelch/go that referenced this issue Sep 20, 2024
Comment with the source of the builtin types

Updates golang#69530
@neild
Contributor

neild commented Sep 25, 2024

> what's included is based on WHATWG mime sniffing https://mimesniff.spec.whatwg.org/ this gives us a clear spec to adhere to, rather than an arbitrary list.

net/http.DetectContentType is based on WHATWG's spec; this proposal is for the type/extension mapping used by mime.TypeByExtension and other functions in the mime package when the system MIME database (/etc/mime.types or similar) isn't present.

@milhoan

milhoan commented Oct 17, 2024

Per conversation here whatwg/mimesniff#51 (comment), the intent of the Mimesniff spec is

"Based on the recent trajectory of changes to this spec, it seems to me that the scope of the spec is client-side sniffing for cross-browser compatibility and protection for the user against malicious files"

The mimesniff spec is not appropriate for an HTTP server use case. It would be better to adopt a different spec for this.

Alternatively, a new function that is server side appropriate that implements a different spec is needed. (EDIT: This comment was regarding DetectContentType, not TypeByExtension)

@AidanWelch
Author

@milhoan But as of now, this doesn't sniff MIME types. It just maps file extensions to MIME types.

@milhoan

milhoan commented Oct 17, 2024

> @milhoan But as of now, this doesn't mimesniff. It just maps file extensions to mime types

Sorry, I saw the discussion above about DetectContentType being based on that spec (IMO it should not be). Disregard my comment, as this issue is not about that function. I'm 100% in favor of more MIME type coverage for TypeByExtension.

@seankhliao
Member

Looking at what the browsers do for matching file extensions to mime type:

Chromium https://chromium.googlesource.com/chromium/src/+/master/net/base/mime_util.cc#129
Maintains a primary and secondary mapping, with the preference order being: primary, platform, secondary.

Firefox https://searchfox.org/mozilla-central/source/uriloader/exthandler/nsExternalHelperAppService.cpp#2968
list at https://searchfox.org/mozilla-central/source/uriloader/exthandler/nsExternalHelperAppService.cpp#455
const defs https://searchfox.org/mozilla-central/source/netwerk/mime/nsMimeTypes.h
Maintains a default and extra mapping, with the preference order being: default, platform, extras.
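The preference order described for both browsers can be sketched as a tiered lookup. This is an illustration of the ordering only, not code from Chromium or Firefox: the `table` type, the `lookup` function, and the sample entries are assumptions made for the example.

```go
package main

import "fmt"

// table maps a lowercase extension (without the dot) to a MIME type.
type table map[string]string

// lookup consults the tiers in order and returns the first match.
func lookup(ext string, tiers ...table) (string, bool) {
	for _, t := range tiers {
		if typ, ok := t[ext]; ok {
			return typ, true
		}
	}
	return "", false
}

func main() {
	// Illustrative entries only; "platform" stands in for the OS database.
	primary := table{"webp": "image/webp"}
	platform := table{"js": "text/plain", "webp": "image/x-webp"}
	secondary := table{"js": "application/javascript", "zip": "application/zip"}

	for _, ext := range []string{"webp", "js", "zip", "xyz"} {
		typ, ok := lookup(ext, primary, platform, secondary)
		fmt.Printf("%-4s -> %q (found=%v)\n", ext, typ, ok)
	}
}
```

With this ordering, a primary entry cannot be overridden by the platform, while a secondary entry is only a fallback for extensions the platform does not know.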

Below is a table mapping file extensions to Go's MIME types, along with whether Chromium and Firefox include each extension in their primary (1) or secondary (2) lists, and the browser's MIME type where it differs from what Go has.

extension go mime type chrome firefox
3g2 2 (video/3gpp2)
3gp 2 (video/3gpp)
3gpp 2 (video/3gpp)
aac 2 (audio/aac)
ai 2 (application/postscript) 2 (application/postscript)
apk 2 (application/vnd.android.package-archive) 2 (application/vnd.android.package-archive)
apng 1 (image/apng) 2 (image/apng)
appcache 2 (text/cache-manifest)
arj 2 (application/x-arj)
art 2 (image/x-jg)
avif image/avif 1 2
bin 2 (application/octet-stream) 2 (application/octet-stream)
bmp 2 (image/bmp) 2 (image/bmp)
cer 2 (application/x-x509-ca-cert)
com 2 (application/octet-stream) 2 (application/octet-stream)
crt 2 (application/x-x509-ca-cert)
crx 1 (application/x-chrome-extension)
css text/css 1 2
csv 1 (text/csv) 2 (text/csv)
cur 2 (image/x-icon)
doc 2 (application/msword) 2 (application/msword)
docx 2 (application/vnd.openxmlformats-officedocument.wordprocessingml.document) 2 (application/vnd.openxmlformats-officedocument.wordprocessingml.document)
dot 2 (application/msword)
ehtml 2 (text/html) 2 (text/html)
eml 2 (message/rfc822) 2 (message/rfc822)
eps 2 (application/postscript) 2 (application/postscript)
epub 2 (application/epub+zip)
exe 2 (application/octet-stream) 2 (application/octet-stream)
flac 1 (audio/flac) 2 (audio/flac)
ftl 1 (text/plain)
gif image/gif 1 2
gz 2 (application/x-gzip) 2 (application/gzip)
htm text/html 1 2
html text/html 1 2
ical 2 (text/calendar)
icalendar 2 (text/calendar)
ico 2 (image/vnd.microsoft.icon) 2 (image/x-icon)
ics 2 (text/calendar) 2 (text/calendar)
ifb 2 (text/calendar)
jfif 2 (image/jpeg) 2 (image/jpeg)
jpeg image/jpeg 1 2
jpg image/jpeg 1 2
js text/javascript 2 (application/javascript) 2 (application/x-javascript)
jsm 2 (application/x-javascript)
json application/json 2 2
jxl 2 (image/jxl)
locale 1 (text/plain)
m3u8 2 (application/x-mpegurl)
m4a 1 (audio/x-m4a) 2 (audio/mp4)
m4b 2 (audio/mp4)
m4v 1 (video/mp4)
mht 1 (multipart/related)
mhtml 1 (multipart/related)
mid 2 (audio/x-midi)
mjs text/javascript 1 2 (application/x-javascript)
mml 2 (application/mathml+xml)
mp2 2 (audio/mpeg)
mp3 1 (audio/mp3) 2 (audio/mpeg)
mp4 1 (video/mp4) 2 (video/mp4)
mpeg 2 (video/mpeg)
mpega 2 (audio/mpeg)
mpg 2 (video/mpeg)
odg 2 (application/vnd.oasis.opendocument.graphics)
odp 2 (application/vnd.oasis.opendocument.presentation)
ods 2 (application/vnd.oasis.opendocument.spreadsheet)
odt 2 (application/vnd.oasis.opendocument.text)
oga 1 (audio/ogg) 2 (audio/ogg)
ogg 1 (audio/ogg) 2 (application/ogg)
ogm 1 (video/ogg)
ogv 1 (video/ogg) 2 (video/ogg)
opus 1 (audio/ogg) 2 (audio/ogg)
p7c 2 (application/pkcs7-mime)
p7m 2 (application/pkcs7-mime)
p7s 2 (application/pkcs7-signature)
p7z 2 (application/pkcs7-mime)
pdf application/pdf 2 2
pjp 2 (image/jpeg) 2 (image/jpeg)
pjpeg 2 (image/jpeg) 2 (image/jpeg)
png image/png 2 (image/x-png) 2
ppt 2 (application/vnd.ms-powerpoint) 2 (application/vnd.ms-powerpoint)
pptx 2 (application/vnd.openxmlformats-officedocument.presentationml.presentation) 2 (application/vnd.openxmlformats-officedocument.presentationml.presentation)
properties 1 (text/plain)
ps 2 (application/postscript) 2 (application/postscript)
rdf 2 (application/rdf+xml) 2 (application/rdf+xml)
rss 2 (application/rss+xml)
rtf 2 (application/rtf) 2 (application/rtf)
sh 2 (text/x-sh)
shtm 1 (text/html)
shtml 1 (text/html) 2 (text/html)
svg image/svg+xml 1 2
svgz 1 (image/svg+xml)
swf 2 (application/x-shockwave-flash)
swl 2 (application/x-shockwave-flash)
tar 2 (application/x-tar)
text 2 (text/plain) 2 (text/plain)
tgz 2 (application/x-gzip)
tif 2 (image/tiff) 2 (image/tiff)
tiff 2 (image/tiff) 2 (image/tiff)
txt 2 (text/plain) 2 (text/plain)
vcard 2 (text/vcard)
vcf 2 (text/vcard)
vtt 2 (text/vtt) 2 (text/vtt)
wasm application/wasm 1 2
wav 1 (audio/wav) 2 (audio/x-wav)
weba 2 (audio/webm)
webm 1 (audio/webm) 2 (audio/webm)
webp image/webp 1 2
woff 2 (application/font-woff)
xbl 2 (text/xml) 2 (text/xml)
xbm 2 (image/x-xbitmap) 2 (image/x-xbitmap)
xht 1 (application/xhtml+xml) 2 (application/xhtml+xml)
xhtm 1 (application/xhtml+xml)
xhtml 1 (application/xhtml+xml) 2 (application/xhtml+xml)
xls 2 (application/vnd.ms-excel) 2 (application/vnd.ms-excel)
xlsx 2 (application/vnd.openxmlformats-officedocument.spreadsheetml.sheet) 2 (application/vnd.openxmlformats-officedocument.spreadsheetml.sheet)
xml text/xml 1 2
xpi 2 (application/x-xpinstall)
xsl 2 (text/xml) 2 (text/xml)
xslt 2 (text/xml)
xul 2 (application/vnd.mozilla.xul+xml)
yuv 2 (video/x-raw-yuv)
zip 2 (application/zip) 2 (application/zip)

@seankhliao
Member

If we are to add more, I propose we limit it to what both browsers have decided to include in their built in lists.
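One way to derive such a list is to intersect the two browsers' extension sets, roughly as sketched here. The `common` function is hypothetical, and the sample entries are a small subset based on my reading of the table above (in particular, assigning `crx` to Chromium only and `xul` to Firefox only is an assumption).

```go
package main

import (
	"fmt"
	"sort"
)

// common returns the sorted extensions that appear in both maps.
func common(a, b map[string]string) []string {
	var exts []string
	for ext := range a {
		if _, ok := b[ext]; ok {
			exts = append(exts, ext)
		}
	}
	sort.Strings(exts)
	return exts
}

func main() {
	// Illustrative subsets only, not the browsers' full lists.
	chromium := map[string]string{
		"csv": "text/csv",
		"gz":  "application/x-gzip",
		"crx": "application/x-chrome-extension",
	}
	firefox := map[string]string{
		"csv": "text/csv",
		"gz":  "application/gzip",
		"xul": "application/vnd.mozilla.xul+xml",
	}

	for _, ext := range common(chromium, firefox) {
		note := ""
		if chromium[ext] != firefox[ext] {
			note = " (types differ, needs a decision)"
		}
		fmt.Printf("%s: chromium=%s firefox=%s%s\n", ext, chromium[ext], firefox[ext], note)
	}
}
```

The intersection only settles which extensions to include. Where the browsers disagree on the type, as with `gz` here, a separate choice is still needed.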

@AidanWelch
Author

That sounds good to me. I can update the PR if that is what's decided on.

@neild
Contributor

neild commented Nov 18, 2024

Interestingly, the one case where we override the platform value (on Windows, we ignore a registry entry mapping .js to text/plain) is one where Chrome and Firefox apparently prefer the platform setting.

Limiting our list of builtin mappings to what both Chrome and Firefox include seems reasonably principled. I'd support that.
