textmate

mirror of https://github.com/textmate/textmate.git synced 2026-01-13 00:37:57 -05:00

Author	SHA1	Message	Date
Allan Odgaard	0520e4fe88	Add text::transcode_t which is a wrapper for iconv. Apart from being simpler to use this wrapper supports adding ‘//BOM’ to the charset name to either consume or produce a byte order marker. It also converts invalid byte sequences to (ASCII) escape codes, e.g. \x8F.	2016-06-21 18:31:29 +02:00
Allan Odgaard	7675aeb4ec	Changing case would truncate the result if it grew in size	2014-06-28 17:42:22 +02:00
Allan Odgaard	cf452cdcee	Increase the number of tests for sanitizing UTF-8. Also harmonize the formatting of the existing tests.	2014-04-01 16:01:19 +07:00
Allan Odgaard	1840f5b7fa	Improve utf8::find_safe_end implementation. Previously calling the function with invalid UTF-8 could cause it to iterate over all the data and, if built in debug mode, could cause an assertion failure. Now we return the sequence’s end when the data appears to be malformed and we never look at more than the last 6 bytes in the sequence.	2014-03-03 13:48:12 +07:00
Allan Odgaard	c2397484b8	Use C++11 for loop. Majority of the edits done using the following ruby script: def update_loops(src) dst, cnt = '', 0 block_indent, variable = nil, nil src.each_line do |line| if block_indent if line =~ /^#{block_indent}([{}\t])|^\t$/ block_indent = nil if $1 == '}' line = line.gsub(%r{ ([^a-z>]) \(\*#{variable}\) | \*#{variable}\b | \b#{variable}(->) }x) do $1.to_s + variable + ($2 == "->" ? "." : "") end else block_indent = nil end elsif line =~ /^(\t)c?iterate\((\w+), (?!diacritics::make_range)(.*\))$/ block_indent, variable = $1, $2 line = "#$1for(auto const& #$2 : #$3\n" cnt += 1 end dst << line end return dst, cnt end paths.each do |path| src = IO.read(path) cnt = 1 while cnt != 0 src, cnt = update_loops(src) STDERR << "#{path}: #{cnt}\n" end File.open(path, "w") { |io| io << src } end	2014-03-03 10:34:13 +07:00
Allan Odgaard	2fa5d7ddb2	Add UTF-8 sanitization function. This can be used to remove malformed multibyte sequences.	2013-10-08 21:59:54 +02:00
Allan Odgaard	b7bc35ed9d	Let decode::url_part convert plus to space	2013-08-29 13:26:16 +02:00
Allan Odgaard	f05426378c	Update testing system for text framework	2013-07-26 13:53:58 +02:00
Allan Odgaard	20378c426e	A full match should rank highest	2013-01-18 13:34:57 +01:00
Allan Odgaard	ebab500ba3	Use std::map/set instead of C arrays. These types come with a find() method and avoid having to use helper functions to get the begin/end of the array (for linear search).	2012-09-20 12:22:20 +02:00
Allan Odgaard	45f847d01e	Add text::is_east_asian_width. This checks if the character needs to be counted as double-width (for soft wrap and similar). I used the following script to generate the tables, it should be improved to collapse the ranges: #!/usr/bin/ruby fixed, start, stop = [ ], [ ], [ ] open('|curl -Ls http://www.unicode.org/Public/UNIDATA/EastAsianWidth.txt') do |io| io.grep(/^([0-9A-F]+)(?:..([0-9A-F]+))?;[A-Za-z]*W/) do if $2 start << "0x#$1" stop << "0x#$2" else fixed << "0x#$1" end end end puts "static uint32_t Fixed[] = { #{fixed.join(', ')} };\n" puts "static uint32_t RangeBegin[] = { #{start.join(', ')} };\n" puts "static uint32_t RangeEnd[] = { #{stop.join(', ')} };\n"	2012-08-18 21:29:05 +02:00
Allan Odgaard	9894969e67	Initial commit	2012-08-09 16:25:56 +02:00

12 Commits