Notes on how Date, TimeZone, daylight saving time, Locale, Calendar, DateFormat, and fastTime relate to one another in Java, plus some open-source utility classes.

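Java's Calendar and TimeZone are both abstract classes. The public concrete subclasses in the JDK are GregorianCalendar and SimpleTimeZone. OpenJDK also ships a package-private JapaneseImperialCalendar, while JulianCalendar lives in sun.util.calendar as an internal calendar system rather than a subclass of java.util.Calendar. A Chinese lunar calendar subclass is still waiting for someone to write it.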

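A Calendar holds a TimeZone member variable. TimeZone itself has no Locale member; the Locale is passed to the Calendar separately and supplies its week rules (first day of the week, minimal days in the first week). When neither is given explicitly, the default TimeZone and default Locale come from the system environment.

A Calendar is therefore determined by its time zone and its locale.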
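As a minimal sketch of that relationship, the snippet below builds calendars for explicit zones and locales instead of relying on the system defaults. The class name CalendarZoneLocaleDemo and its helpers are made up for illustration; only Calendar.getInstance(TimeZone, Locale) and the getters come from the JDK.

```java
import java.util.Calendar;
import java.util.Locale;
import java.util.TimeZone;

class CalendarZoneLocaleDemo {
    // Builds a Calendar for an explicit zone and locale (hypothetical helper).
    static Calendar create(String zoneId, Locale locale) {
        return Calendar.getInstance(TimeZone.getTimeZone(zoneId), locale);
    }

    // Reads HOUR_OF_DAY for one fixed instant as seen from the given zone.
    static int hourOfDayAt(long epochMillis, String zoneId, Locale locale) {
        Calendar cal = create(zoneId, locale);
        cal.setTimeInMillis(epochMillis);
        return cal.get(Calendar.HOUR_OF_DAY);
    }

    public static void main(String[] args) {
        Calendar shanghai = create("Asia/Shanghai", Locale.CHINA);
        Calendar paris = create("Europe/Paris", Locale.FRANCE);

        // The zone is stored on the Calendar itself.
        System.out.println(shanghai.getTimeZone().getID());
        System.out.println(paris.getTimeZone().getID());

        // The locale contributes the week rules; exact values depend on the JDK's locale data.
        System.out.println(shanghai.getFirstDayOfWeek() + " / " + shanghai.getMinimalDaysInFirstWeek());
        System.out.println(paris.getFirstDayOfWeek() + " / " + paris.getMinimalDaysInFirstWeek());

        // The same instant (the epoch) maps to different field values in different zones.
        System.out.println(hourOfDayAt(0L, "Asia/Shanghai", Locale.CHINA));
        System.out.println(hourOfDayAt(0L, "Europe/Paris", Locale.FRANCE));
    }
}
```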

The relationship between Date, Calendar, and DateFormat

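See the Javadoc excerpt below. Before JDK 1.1, Date had two additional functions: interpreting a date as year, month, day, hour, minute, and second values, and formatting and parsing date strings. Neither worked well for internationalization across time zones and locales, so as of JDK 1.1 these functions were split out and handed to Calendar and DateFormat respectively, and the corresponding Date methods were deprecated. They have not been removed so far, but they are best avoided, or used only in simple test code.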

Since JDK 1.1, Date has only two non-deprecated constructors, Date() and Date(long date); the long date here is the value stored in fastTime. The remaining non-deprecated methods all compare dates or get and set fastTime: before(Date when), after(Date when), clone(), compareTo(Date anotherDate), equals(Object obj), getTime(), setTime(long time), toString().
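The non-deprecated surface of Date is small enough to show in one place. The following is only a sketch; the class name DateBasicsDemo and the helper plusMillis are invented for illustration.

```java
import java.util.Date;

class DateBasicsDemo {
    // Returns a new Date shifted by deltaMillis; Date arithmetic is plain long arithmetic.
    static Date plusMillis(Date base, long deltaMillis) {
        return new Date(base.getTime() + deltaMillis);
    }

    public static void main(String[] args) {
        Date epoch = new Date(0L);               // millisecond value 0
        Date later = plusMillis(epoch, 1_000L);  // one second after the epoch

        System.out.println(epoch.before(later));         // true
        System.out.println(later.after(epoch));          // true
        System.out.println(epoch.compareTo(later) < 0);  // true
        System.out.println(epoch.equals(new Date(0L)));  // true

        // Date is mutable; clone() gives an independent copy, so setTime changes only the copy.
        Date copy = (Date) epoch.clone();
        copy.setTime(42L);
        System.out.println(epoch.getTime() + " " + copy.getTime()); // 0 42
    }
}
```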

Prior to JDK 1.1, the class Date had two additional functions. It allowed the interpretation of dates as year, month, day, hour, minute, and second values. It also allowed the formatting and parsing of date strings. Unfortunately, the API for these functions was not amenable to internationalization. As of JDK 1.1, the Calendar class should be used to convert between dates and time fields and the DateFormat class should be used to format and parse date strings. The corresponding methods in Date are deprecated.

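Java also has a deprecated constructor, new Date(String date). The string is simply handed to another deprecated method, Date.parse(String s). The Javadoc for that method is reproduced below for reference. Avoid it, or use it only in simple test code.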

Attempts to interpret the string s as a representation of a date and time. If the attempt is successful, the time indicated is returned represented as the distance, measured in milliseconds, of that time from the epoch (00:00:00 GMT on January 1, 1970). If the attempt fails, an IllegalArgumentException is thrown.
It accepts many syntaxes; in particular, it recognizes the IETF standard date syntax: "Sat, 12 Aug 1995 13:30:00 GMT". It also understands the continental U.S. time-zone abbreviations, but for general use, a time-zone offset should be used: "Sat, 12 Aug 1995 13:30:00 GMT+0430" (4 hours, 30 minutes west of the Greenwich meridian). If no time zone is specified, the local time zone is assumed. GMT and UTC are considered equivalent.

The string s is processed from left to right, looking for data of interest. Any material in s that is within the ASCII parenthesis characters ( and ) is ignored. Parentheses may be nested. Otherwise, the only characters permitted within s are these ASCII characters:

 abcdefghijklmnopqrstuvwxyz
 ABCDEFGHIJKLMNOPQRSTUVWXYZ
 0123456789,+-:/
and whitespace characters.
A consecutive sequence of decimal digits is treated as a decimal number:

If a number is preceded by + or - and a year has already been recognized, then the number is a time-zone offset. If the number is less than 24, it is an offset measured in hours. Otherwise, it is regarded as an offset in minutes, expressed in 24-hour time format without punctuation. A preceding - means a westward offset. Time zone offsets are always relative to UTC (Greenwich). Thus, for example, -5 occurring in the string would mean "five hours west of Greenwich" and +0430 would mean "four hours and thirty minutes east of Greenwich." It is permitted for the string to specify GMT, UT, or UTC redundantly, for example, GMT-5 or utc+0430.
The number is regarded as a year number if one of the following conditions is true:
The number is equal to or greater than 70 and followed by a space, comma, slash, or end of string
The number is less than 70, and both a month and a day of the month have already been recognized
If the recognized year number is less than 100, it is interpreted as an abbreviated year relative to a century of which dates are within 80 years before and 19 years after the time when the Date class is initialized. After adjusting the year number, 1900 is subtracted from it. For example, if the current year is 1999 then years in the range 19 to 99 are assumed to mean 1919 to 1999, while years from 0 to 18 are assumed to mean 2000 to 2018. Note that this is slightly different from the interpretation of years less than 100 that is used in SimpleDateFormat.
If the number is followed by a colon, it is regarded as an hour, unless an hour has already been recognized, in which case it is regarded as a minute.
If the number is followed by a slash, it is regarded as a month (it is decreased by 1 to produce a number in the range 0 to 11), unless a month has already been recognized, in which case it is regarded as a day of the month.
If the number is followed by whitespace, a comma, a hyphen, or end of string, then if an hour has been recognized but not a minute, it is regarded as a minute; otherwise, if a minute has been recognized but not a second, it is regarded as a second; otherwise, it is regarded as a day of the month.
A consecutive sequence of letters is regarded as a word and treated as follows:

A word that matches AM, ignoring case, is ignored (but the parse fails if an hour has not been recognized or is less than 1 or greater than 12).
A word that matches PM, ignoring case, adds 12 to the hour (but the parse fails if an hour has not been recognized or is less than 1 or greater than 12).
Any word that matches any prefix of SUNDAY, MONDAY, TUESDAY, WEDNESDAY, THURSDAY, FRIDAY, or SATURDAY, ignoring case, is ignored. For example, sat, Friday, TUE, and Thurs are ignored.
Otherwise, any word that matches any prefix of JANUARY, FEBRUARY, MARCH, APRIL, MAY, JUNE, JULY, AUGUST, SEPTEMBER, OCTOBER, NOVEMBER, or DECEMBER, ignoring case, and considering them in the order given here, is recognized as specifying a month and is converted to a number (0 to 11). For example, aug, Sept, april, and NOV are recognized as months. So is Ma, which is recognized as MARCH, not MAY.
Any word that matches GMT, UT, or UTC, ignoring case, is treated as referring to UTC.
Any word that matches EST, CST, MST, or PST, ignoring case, is recognized as referring to the time zone in North America that is five, six, seven, or eight hours west of Greenwich, respectively. Any word that matches EDT, CDT, MDT, or PDT, ignoring case, is recognized as referring to the same time zone, respectively, during daylight saving time.
Once the entire string s has been scanned, it is converted to a time result in one of two ways. If a time zone or time-zone offset has been recognized, then the year, month, day of month, hour, minute, and second are interpreted in UTC and then the time-zone offset is applied. Otherwise, the year, month, day of month, hour, minute, and second are interpreted in the local time zone.
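Since Date.parse is deprecated, the IETF-style string from the Javadoc above can instead be parsed with SimpleDateFormat and an explicit pattern. This is a sketch under assumptions: the pattern string, the class name IetfDateParseDemo, and the helper parseIetf are my own choices, and the deprecated call appears only for comparison.

```java
import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.Locale;

class IetfDateParseDemo {
    // Parses strings shaped like "Sat, 12 Aug 1995 13:30:00 GMT" (hypothetical helper).
    // A new SimpleDateFormat is created per call because the class is not thread-safe.
    static Date parseIetf(String text) {
        SimpleDateFormat format =
                new SimpleDateFormat("EEE, d MMM yyyy HH:mm:ss z", Locale.US);
        try {
            return format.parse(text);
        } catch (ParseException e) {
            // Mirror Date.parse, which signals failure with IllegalArgumentException.
            throw new IllegalArgumentException("Unparseable date: " + text, e);
        }
    }

    @SuppressWarnings("deprecation")
    public static void main(String[] args) {
        String text = "Sat, 12 Aug 1995 13:30:00 GMT";
        long viaFormat = parseIetf(text).getTime();
        long viaDeprecated = Date.parse(text); // deprecated, shown for comparison only
        System.out.println(viaFormat);
        System.out.println(viaFormat == viaDeprecated);
    }
}
```

Unlike Date.parse, the explicit pattern accepts exactly one layout, which is usually what production code wants.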

 java.util.Calendar

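Calendar has many details; see the Javadoc below. Most of the time we use GregorianCalendar, which implements the Gregorian calendar, the solar calendar used in everyday life. Its time value counts from the epoch, 1970-01-01 00:00:00 GMT. It differs from the traditional Chinese lunar calendar (also called the Xia calendar, the yin calendar, or the old calendar) handed down by our ancestors.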

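January 1, 1970 00:00:00 GMT is the epoch. In the GregorianCalendar source, EPOCH_YEAR is 1970 and EPOCH_OFFSET is 719163. The latter is the fixed date of January 1, 1970, that is, its day number when January 1 of year 1 (proleptic Gregorian) is counted as day 1; it is not a count from year 0000. With this number, a fastTime value can be turned into a day number and from there into a calendar date.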
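The value 719163 can be checked with java.time, whose ISO chronology is proleptic Gregorian. The sketch below is only a verification aid: the class name EpochOffsetDemo and both helpers are hypothetical, and the JDK's internal code does not work this way.

```java
import java.time.LocalDate;
import java.time.temporal.ChronoUnit;

class EpochOffsetDemo {
    static final long MILLIS_PER_DAY = 24L * 60 * 60 * 1000;

    // Fixed-date numbering: January 1 of year 1 (proleptic Gregorian) is day 1.
    static long fixedDate(LocalDate date) {
        return ChronoUnit.DAYS.between(LocalDate.of(1, 1, 1), date) + 1;
    }

    // Converts a UTC millisecond value to a fixed date by counting whole days from the epoch.
    // floorDiv keeps instants before 1970 on the correct (earlier) day.
    static long fixedDateFromMillis(long utcMillis) {
        return Math.floorDiv(utcMillis, MILLIS_PER_DAY) + fixedDate(LocalDate.of(1970, 1, 1));
    }

    public static void main(String[] args) {
        System.out.println(fixedDate(LocalDate.of(1970, 1, 1)));   // 719163
        System.out.println(fixedDateFromMillis(0L));               // 719163
        System.out.println(fixedDateFromMillis(-1L));              // 719162
    }
}
```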

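Another detail is the cutover (-12219292800000L) between the Julian calendar and the Gregorian calendar. October 4, 1582 in the Julian calendar is followed directly by October 15, 1582 in the Gregorian calendar. Knowing this makes the source code much easier to read.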
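The cutover can be observed directly through the public GregorianCalendar API. A small sketch follows; the class name CutoverDemo and its helpers are made up for illustration, and UTC is chosen so the result does not depend on the machine's default zone.

```java
import java.util.Calendar;
import java.util.GregorianCalendar;
import java.util.TimeZone;

class CutoverDemo {
    // The default Julian-to-Gregorian change date, as epoch milliseconds.
    static long defaultCutoverMillis() {
        return new GregorianCalendar().getGregorianChange().getTime();
    }

    // Returns the day of month that follows the given date.
    static int dayAfter(int year, int month, int day) {
        GregorianCalendar cal = new GregorianCalendar(TimeZone.getTimeZone("UTC"));
        cal.clear();                 // drop the "now" fields set by the constructor
        cal.set(year, month, day);
        cal.add(Calendar.DAY_OF_MONTH, 1);
        return cal.get(Calendar.DAY_OF_MONTH);
    }

    public static void main(String[] args) {
        System.out.println(defaultCutoverMillis());                  // -12219292800000
        System.out.println(dayAfter(1582, Calendar.OCTOBER, 4));     // 15
    }
}
```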

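A Calendar is determined by its time zone, its locale, and the date. Different regions use different calendar systems, and the time zone affects how a date is localized into calendar fields.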

The JDK creates a Calendar through the getInstance factory methods, as described in the Javadoc below.


The Calendar class is an abstract class that provides methods for converting between a specific instant in time and a set of calendar fields such as YEAR, MONTH, DAY_OF_MONTH, HOUR, and so on, and for manipulating the calendar fields, such as getting the date of the next week. An instant in time can be represented by a millisecond value that is an offset from the Epoch, January 1, 1970 00:00:00.000 GMT (Gregorian).
The class also provides additional fields and methods for implementing a concrete calendar system outside the package. Those fields and methods are defined as protected.

Like other locale-sensitive classes, Calendar provides a class method, getInstance, for getting a generally useful object of this type. Calendar's getInstance method returns a Calendar object whose calendar fields have been initialized with the current date and time:

     Calendar rightNow = Calendar.getInstance();
 
A Calendar object can produce all the calendar field values needed to implement the date-time formatting for a particular language and calendar style (for example, Japanese-Gregorian, Japanese-Traditional). Calendar defines the range of values returned by certain calendar fields, as well as their meaning. For example, the first month of the calendar system has value MONTH == JANUARY for all calendars. Other values are defined by the concrete subclass, such as ERA. See individual field documentation and subclass documentation for details.

Getting and Setting Calendar Field Values

The calendar field values can be set by calling the set methods. Any field values set in a Calendar will not be interpreted until it needs to calculate its time value (milliseconds from the Epoch) or values of the calendar fields. Calling the get, getTimeInMillis, getTime, add and roll involves such calculation.

Leniency

Calendar has two modes for interpreting the calendar fields, lenient and non-lenient. When a Calendar is in lenient mode, it accepts a wider range of calendar field values than it produces. When a Calendar recomputes calendar field values for return by get(), all of the calendar fields are normalized. For example, a lenient GregorianCalendar interprets MONTH == JANUARY, DAY_OF_MONTH == 32 as February 1.

When a Calendar is in non-lenient mode, it throws an exception if there is any inconsistency in its calendar fields. For example, a GregorianCalendar always produces DAY_OF_MONTH values between 1 and the length of the month. A non-lenient GregorianCalendar throws an exception upon calculating its time or calendar field values if any out-of-range field value has been set.
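The two modes can be compared with the January 32 example from the text. This is a minimal sketch; LeniencyDemo and its helpers are hypothetical names.

```java
import java.util.Calendar;
import java.util.GregorianCalendar;
import java.util.TimeZone;

class LeniencyDemo {
    // Sets the fields to "January 32, 1999" without triggering any computation.
    static GregorianCalendar january32(boolean lenient) {
        GregorianCalendar cal = new GregorianCalendar(TimeZone.getTimeZone("UTC"));
        cal.clear();
        cal.setLenient(lenient);
        cal.set(Calendar.YEAR, 1999);
        cal.set(Calendar.MONTH, Calendar.JANUARY);
        cal.set(Calendar.DAY_OF_MONTH, 32);
        return cal;
    }

    // True if the non-lenient calendar refuses the out-of-range day when computing its time.
    static boolean nonLenientRejects() {
        try {
            january32(false).getTime();
            return false;
        } catch (IllegalArgumentException expected) {
            return true;
        }
    }

    public static void main(String[] args) {
        GregorianCalendar lenient = january32(true);
        // Lenient mode normalizes the overflow into the next month.
        System.out.println(lenient.get(Calendar.MONTH) == Calendar.FEBRUARY); // true
        System.out.println(lenient.get(Calendar.DAY_OF_MONTH));               // 1
        System.out.println(nonLenientRejects());                              // true
    }
}
```

Note that the exception surfaces only when the time is computed, not at the set call.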

First Week

Calendar defines a locale-specific seven day week using two parameters: the first day of the week and the minimal days in first week (from 1 to 7). These numbers are taken from the locale resource data when a Calendar is constructed. They may also be specified explicitly through the methods for setting their values.
When setting or getting the WEEK_OF_MONTH or WEEK_OF_YEAR fields, Calendar must determine the first week of the month or year as a reference point. The first week of a month or year is defined as the earliest seven day period beginning on getFirstDayOfWeek() and containing at least getMinimalDaysInFirstWeek() days of that month or year. Weeks numbered ..., -1, 0 precede the first week; weeks numbered 2, 3,... follow it. Note that the normalized numbering returned by get() may be different. For example, a specific Calendar subclass may designate the week before week 1 of a year as week n of the previous year.
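To see how the two parameters change week numbering, the sketch below sets them explicitly rather than reading them from a locale, so the result does not depend on locale data. FirstWeekDemo and weekOfYear are illustrative names; January 1, 2021 (a Friday) is used as the probe date.

```java
import java.util.Calendar;
import java.util.GregorianCalendar;
import java.util.TimeZone;

class FirstWeekDemo {
    // Computes WEEK_OF_YEAR for a date under explicit week rules.
    static int weekOfYear(int firstDayOfWeek, int minimalDays, int year, int month, int day) {
        GregorianCalendar cal = new GregorianCalendar(TimeZone.getTimeZone("UTC"));
        cal.clear();
        cal.setFirstDayOfWeek(firstDayOfWeek);
        cal.setMinimalDaysInFirstWeek(minimalDays);
        cal.set(year, month, day);
        return cal.get(Calendar.WEEK_OF_YEAR);
    }

    public static void main(String[] args) {
        // ISO 8601 style rules: weeks start on Monday, first week needs at least 4 days.
        // January 1, 2021 falls before week 1, so it is reported as a week of the previous year.
        System.out.println(weekOfYear(Calendar.MONDAY, 4, 2021, Calendar.JANUARY, 1)); // 53

        // US style rules: weeks start on Sunday, a single day is enough for week 1.
        System.out.println(weekOfYear(Calendar.SUNDAY, 1, 2021, Calendar.JANUARY, 1)); // 1
    }
}
```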

Calendar Fields Resolution

When computing a date and time from the calendar fields, there may be insufficient information for the computation (such as only year and month with no day of month), or there may be inconsistent information (such as Tuesday, July 15, 1996 (Gregorian) -- July 15, 1996 is actually a Monday). Calendar will resolve calendar field values to determine the date and time in the following way.
If there is any conflict in calendar field values, Calendar gives priorities to calendar fields that have been set more recently. The following are the default combinations of the calendar fields. The most recent combination, as determined by the most recently set single field, will be used.

For the date fields:

 YEAR + MONTH + DAY_OF_MONTH
 YEAR + MONTH + WEEK_OF_MONTH + DAY_OF_WEEK
 YEAR + MONTH + DAY_OF_WEEK_IN_MONTH + DAY_OF_WEEK
 YEAR + DAY_OF_YEAR
 YEAR + DAY_OF_WEEK + WEEK_OF_YEAR
 
For the time of day fields:
 HOUR_OF_DAY
 AM_PM + HOUR
 
If there are any calendar fields whose values haven't been set in the selected field combination, Calendar uses their default values. The default value of each field may vary by concrete calendar systems. For example, in GregorianCalendar, the default of a field is the same as that of the start of the Epoch: i.e., YEAR = 1970, MONTH = JANUARY, DAY_OF_MONTH = 1, etc.
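The "most recently set field wins" rule can be demonstrated by giving a calendar two conflicting date combinations in different orders. A sketch with hypothetical names (FieldResolutionDemo, resolve):

```java
import java.util.Calendar;
import java.util.GregorianCalendar;
import java.util.TimeZone;

class FieldResolutionDemo {
    // Sets both MONTH + DAY_OF_MONTH (July 15) and DAY_OF_YEAR (day 1) for the year 2000.
    // Returns {MONTH, DAY_OF_MONTH} after resolution.
    static int[] resolve(boolean dayOfYearSetLast) {
        GregorianCalendar cal = new GregorianCalendar(TimeZone.getTimeZone("UTC"));
        cal.clear();
        cal.set(Calendar.YEAR, 2000);
        if (dayOfYearSetLast) {
            cal.set(Calendar.MONTH, Calendar.JULY);
            cal.set(Calendar.DAY_OF_MONTH, 15);
            cal.set(Calendar.DAY_OF_YEAR, 1);      // most recent: YEAR + DAY_OF_YEAR is used
        } else {
            cal.set(Calendar.DAY_OF_YEAR, 1);
            cal.set(Calendar.MONTH, Calendar.JULY);
            cal.set(Calendar.DAY_OF_MONTH, 15);    // most recent: YEAR + MONTH + DAY_OF_MONTH is used
        }
        return new int[] { cal.get(Calendar.MONTH), cal.get(Calendar.DAY_OF_MONTH) };
    }

    public static void main(String[] args) {
        int[] a = resolve(true);
        int[] b = resolve(false);
        System.out.println(a[0] + " " + a[1]); // 0 1   (January 1)
        System.out.println(b[0] + " " + b[1]); // 6 15  (July 15)
    }
}
```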

Note: There are certain possible ambiguities in interpretation of certain singular times, which are resolved in the following ways:

23:59 is the last minute of the day and 00:00 is the first minute of the next day. Thus, 23:59 on Dec 31, 1999 < 00:00 on Jan 1, 2000 < 00:01 on Jan 1, 2000.
Although historically not precise, midnight also belongs to "am", and noon belongs to "pm", so on the same day, 12:00 am (midnight) < 12:01 am, and 12:00 pm (noon) < 12:01 pm.
The date or time format strings are not part of the definition of a calendar, as those must be modifiable or overridable by the user at runtime. Use DateFormat to format dates.

Field Manipulation

The calendar fields can be changed using three methods: set(), add(), and roll().
set(f, value) changes calendar field f to value. In addition, it sets an internal member variable to indicate that calendar field f has been changed. Although calendar field f is changed immediately, the calendar's time value in milliseconds is not recomputed until the next call to get(), getTime(), getTimeInMillis(), add(), or roll() is made. Thus, multiple calls to set() do not trigger multiple, unnecessary computations. As a result of changing a calendar field using set(), other calendar fields may also change, depending on the calendar field, the calendar field value, and the calendar system. In addition, get(f) will not necessarily return value set by the call to the set method after the calendar fields have been recomputed. The specifics are determined by the concrete calendar class.

Example: Consider a GregorianCalendar originally set to August 31, 1999. Calling set(Calendar.MONTH, Calendar.SEPTEMBER) sets the date to September 31, 1999. This is a temporary internal representation that resolves to October 1, 1999 if getTime() is then called. However, a call to set(Calendar.DAY_OF_MONTH, 30) before the call to getTime() sets the date to September 30, 1999, since no recomputation occurs after set() itself.
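The deferred recomputation in this example can be reproduced as follows. This is a sketch; LazySetDemo and its helper names are invented for illustration.

```java
import java.util.Calendar;
import java.util.GregorianCalendar;
import java.util.TimeZone;

class LazySetDemo {
    static GregorianCalendar august31() {
        GregorianCalendar cal = new GregorianCalendar(TimeZone.getTimeZone("UTC"));
        cal.clear();
        cal.set(1999, Calendar.AUGUST, 31);
        cal.getTime(); // force computation so the calendar really holds August 31, 1999
        return cal;
    }

    // set(MONTH) followed immediately by getTime(): "September 31" resolves to October 1.
    static int[] setMonthThenRead() {
        GregorianCalendar cal = august31();
        cal.set(Calendar.MONTH, Calendar.SEPTEMBER);
        cal.getTime();
        return new int[] { cal.get(Calendar.MONTH), cal.get(Calendar.DAY_OF_MONTH) };
    }

    // set(MONTH) then set(DAY_OF_MONTH) before any read: the date becomes September 30.
    static int[] setMonthThenDay() {
        GregorianCalendar cal = august31();
        cal.set(Calendar.MONTH, Calendar.SEPTEMBER);
        cal.set(Calendar.DAY_OF_MONTH, 30);
        return new int[] { cal.get(Calendar.MONTH), cal.get(Calendar.DAY_OF_MONTH) };
    }

    public static void main(String[] args) {
        int[] a = setMonthThenRead();
        int[] b = setMonthThenDay();
        System.out.println(a[0] + " " + a[1]); // 9 1   (October 1)
        System.out.println(b[0] + " " + b[1]); // 8 30  (September 30)
    }
}
```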

add(f, delta) adds delta to field f. This is equivalent to calling set(f, get(f) + delta) with two adjustments:

Add rule 1. The value of field f after the call minus the value of field f before the call is delta, modulo any overflow that has occurred in field f. Overflow occurs when a field value exceeds its range and, as a result, the next larger field is incremented or decremented and the field value is adjusted back into its range.

Add rule 2. If a smaller field is expected to be invariant, but it is impossible for it to be equal to its prior value because of changes in its minimum or maximum after field f is changed or other constraints, such as time zone offset changes, then its value is adjusted to be as close as possible to its expected value. A smaller field represents a smaller unit of time. HOUR is a smaller field than DAY_OF_MONTH. No adjustment is made to smaller fields that are not expected to be invariant. The calendar system determines what fields are expected to be invariant.

In addition, unlike set(), add() forces an immediate recomputation of the calendar's milliseconds and all fields.

Example: Consider a GregorianCalendar originally set to August 31, 1999. Calling add(Calendar.MONTH, 13) sets the calendar to September 30, 2000. Add rule 1 sets the MONTH field to September, since adding 13 months to August gives September of the next year. Since DAY_OF_MONTH cannot be 31 in September in a GregorianCalendar, add rule 2 sets the DAY_OF_MONTH to 30, the closest possible value. Although it is a smaller field, DAY_OF_WEEK is not adjusted by rule 2, since it is expected to change when the month changes in a GregorianCalendar.
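The add() example above translates directly into code. A minimal sketch, with AddMonthsDemo and addMonths as illustrative names:

```java
import java.util.Calendar;
import java.util.GregorianCalendar;
import java.util.TimeZone;

class AddMonthsDemo {
    // Starts from the given date, adds months, and returns {YEAR, MONTH, DAY_OF_MONTH}.
    static int[] addMonths(int year, int month, int day, int months) {
        GregorianCalendar cal = new GregorianCalendar(TimeZone.getTimeZone("UTC"));
        cal.clear();
        cal.set(year, month, day);
        cal.add(Calendar.MONTH, months); // overflow carries into YEAR; the day is pinned to the month's length
        return new int[] {
            cal.get(Calendar.YEAR), cal.get(Calendar.MONTH), cal.get(Calendar.DAY_OF_MONTH)
        };
    }

    public static void main(String[] args) {
        int[] r = addMonths(1999, Calendar.AUGUST, 31, 13);
        System.out.println(r[0] + " " + r[1] + " " + r[2]); // 2000 8 30  (September 30, 2000)
    }
}
```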

roll(f, delta) adds delta to field f without changing larger fields. This is equivalent to calling add(f, delta) with the following adjustment:

Roll rule. Larger fields are unchanged after the call. A larger field represents a larger unit of time. DAY_OF_MONTH is a larger field than HOUR.

Example: See GregorianCalendar.roll(int, int).
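The example referenced in the GregorianCalendar.roll Javadoc starts from August 31, 1999 and rolls the month by 8. The sketch below contrasts it with add() on the same input; RollVersusAddDemo and shiftMonth are hypothetical names.

```java
import java.util.Calendar;
import java.util.GregorianCalendar;
import java.util.TimeZone;

class RollVersusAddDemo {
    // Applies roll() or add() to the MONTH field and returns {YEAR, MONTH, DAY_OF_MONTH}.
    static int[] shiftMonth(boolean useRoll, int amount) {
        GregorianCalendar cal = new GregorianCalendar(TimeZone.getTimeZone("UTC"));
        cal.clear();
        cal.set(1999, Calendar.AUGUST, 31);
        if (useRoll) {
            cal.roll(Calendar.MONTH, amount); // wraps within the year; YEAR stays unchanged
        } else {
            cal.add(Calendar.MONTH, amount);  // overflow carries into YEAR
        }
        return new int[] {
            cal.get(Calendar.YEAR), cal.get(Calendar.MONTH), cal.get(Calendar.DAY_OF_MONTH)
        };
    }

    public static void main(String[] args) {
        int[] rolled = shiftMonth(true, 8);
        int[] added = shiftMonth(false, 8);
        System.out.println(rolled[0] + " " + rolled[1] + " " + rolled[2]); // 1999 3 30  (April 30, 1999)
        System.out.println(added[0] + " " + added[1] + " " + added[2]);    // 2000 3 30  (April 30, 2000)
    }
}
```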

Usage model. To motivate the behavior of add() and roll(), consider a user interface component with increment and decrement buttons for the month, day, and year, and an underlying GregorianCalendar. If the interface reads January 31, 1999 and the user presses the month increment button, what should it read? If the underlying implementation uses set(), it might read March 3, 1999. A better result would be February 28, 1999. Furthermore, if the user presses the month increment button again, it should read March 31, 1999, not March 28, 1999. By saving the original date and using either add() or roll(), depending on whether larger fields should be affected, the user interface can behave as most users will intuitively expect.
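One way to follow this usage model is to keep the original date and always apply the total number of button presses to a fresh copy. The sketch below assumes that design; MonthSpinnerDemo and its methods are invented for illustration and are not part of any UI toolkit.

```java
import java.util.Calendar;
import java.util.GregorianCalendar;
import java.util.TimeZone;

class MonthSpinnerDemo {
    private static GregorianCalendar january31() {
        GregorianCalendar cal = new GregorianCalendar(TimeZone.getTimeZone("UTC"));
        cal.clear();
        cal.set(1999, Calendar.JANUARY, 31);
        return cal;
    }

    private static int[] monthAndDay(Calendar cal) {
        return new int[] { cal.get(Calendar.MONTH), cal.get(Calendar.DAY_OF_MONTH) };
    }

    // Preferred: apply the total number of presses to a copy of the saved original.
    static int[] afterPresses(int presses) {
        Calendar shown = (Calendar) january31().clone();
        shown.add(Calendar.MONTH, presses);
        return monthAndDay(shown);
    }

    // Naive alternative 1: add one month to the already-adjusted date, twice.
    static int[] chainedTwice() {
        Calendar shown = january31();
        shown.add(Calendar.MONTH, 1); // February 28
        shown.add(Calendar.MONTH, 1); // March 28: the original day 31 has been lost
        return monthAndDay(shown);
    }

    // Naive alternative 2: set() the month directly; "February 31" overflows into March.
    static int[] viaSet() {
        Calendar shown = january31();
        shown.set(Calendar.MONTH, Calendar.FEBRUARY);
        return monthAndDay(shown);
    }

    public static void main(String[] args) {
        System.out.println(java.util.Arrays.toString(afterPresses(1))); // [1, 28]  February 28
        System.out.println(java.util.Arrays.toString(afterPresses(2))); // [2, 31]  March 31
        System.out.println(java.util.Arrays.toString(chainedTwice()));  // [2, 28]  March 28
        System.out.println(java.util.Arrays.toString(viaSet()));        // [2, 3]   March 3
    }
}
```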

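Date and fastTime

Every Date has a fastTime member variable: a long holding the number of milliseconds since the epoch, which is the value getTime() returns.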
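Because a Date is just that millisecond value, it carries no time zone of its own; the zone only enters when the value is formatted or broken into fields. A small sketch (DateZoneDemo and formatIn are illustrative names, and the pattern is my own choice):

```java
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.Locale;
import java.util.TimeZone;

class DateZoneDemo {
    // Formats the same instant as wall-clock time in the given zone.
    static String formatIn(Date date, String zoneId) {
        SimpleDateFormat format = new SimpleDateFormat("yyyy-MM-dd HH:mm", Locale.US);
        format.setTimeZone(TimeZone.getTimeZone(zoneId));
        return format.format(date);
    }

    public static void main(String[] args) {
        Date epoch = new Date(0L); // one instant, one long value
        System.out.println(formatIn(epoch, "UTC"));           // 1970-01-01 00:00
        System.out.println(formatIn(epoch, "Asia/Shanghai")); // 1970-01-01 08:00
    }
}
```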

DateFormat

The international standard for date formats is ISO 8601, which describes many date and time formats. The W3C defined a simplified profile of ISO 8601: a smaller set of date formats suited to the web.

 

   Year:
      YYYY (eg 1997)
   Year and month:
      YYYY-MM (eg 1997-07)
   Complete date:
      YYYY-MM-DD (eg 1997-07-16)
   Complete date plus hours and minutes:
      YYYY-MM-DDThh:mmTZD (eg 1997-07-16T19:20+01:00)
   Complete date plus hours, minutes and seconds:
      YYYY-MM-DDThh:mm:ssTZD (eg 1997-07-16T19:20:30+01:00)
   Complete date plus hours, minutes, seconds and a decimal fraction of a second
      YYYY-MM-DDThh:mm:ss.sTZD (eg 1997-07-16T19:20:30.45+01:00)
where:

     YYYY = four-digit year
     MM   = two-digit month (01=January, etc.)
     DD   = two-digit day of month (01 through 31)
     hh   = two digits of hour (00 through 23) (am/pm NOT allowed)
     mm   = two digits of minute (00 through 59)
     ss   = two digits of second (00 through 59)
     s    = one or more digits representing a decimal fraction of a second
     TZD  = time zone designator (Z or +hh:mm or -hh:mm)
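The "complete date plus hours, minutes and seconds" form can be handled with SimpleDateFormat, where the XXX pattern letters (available since Java 7) correspond to the TZD designator. This is a sketch under assumptions: the class name W3cDateTimeDemo, its helpers, and the choice of pattern are mine, not something the W3C note prescribes.

```java
import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.Locale;
import java.util.TimeZone;

class W3cDateTimeDemo {
    // YYYY-MM-DDThh:mm:ssTZD; XXX prints "Z" for UTC and "+hh:mm" / "-hh:mm" otherwise.
    private static final String PATTERN = "yyyy-MM-dd'T'HH:mm:ssXXX";

    // Parses the text and returns epoch milliseconds.
    static long parseMillis(String text) {
        try {
            // New instance per call: SimpleDateFormat is not thread-safe.
            return new SimpleDateFormat(PATTERN, Locale.US).parse(text).getTime();
        } catch (ParseException e) {
            throw new IllegalArgumentException("Not in the expected format: " + text, e);
        }
    }

    // Formats epoch milliseconds as seen from the given zone.
    static String format(long epochMillis, String zoneId) {
        SimpleDateFormat format = new SimpleDateFormat(PATTERN, Locale.US);
        format.setTimeZone(TimeZone.getTimeZone(zoneId));
        return format.format(new Date(epochMillis));
    }

    public static void main(String[] args) {
        long millis = parseMillis("1997-07-16T19:20:30+01:00");
        System.out.println(format(millis, "GMT+01:00")); // 1997-07-16T19:20:30+01:00
        System.out.println(format(millis, "UTC"));       // 1997-07-16T18:20:30Z
    }
}
```

On Java 8 and later, java.time.OffsetDateTime.parse accepts the same strings without a hand-written pattern.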

References

  • http://www.w3.org/TR/NOTE-datetime
  • http://stackoverflow.com/questions/23975205/why-does-converting-java-dates-before-1582-to-localdate-with-instant-give-a-diff