class Programmer : IEnableMuch {: PostgreSQL

Showing posts with label PostgreSQL. Show all posts

Wednesday, October 8, 2014

BCrypt Primer

Good thing I saw this Postgresql bcrypt use on stackoverflow first..

do
$$
declare hashed text;
begin
    // hashed is what is stored in database
    hashed := crypt('passwordToTest', gen_salt('bf'));

    // then check user login password against the saved hash
    if hashed = crypt('passwordToTest', hashed) then
        raise notice 'matches';
    else
        raise notice 'didn''t match';
    end if;
end;
$$ language 'plpgsql';

..before this BCrypt example for .NET from codeproject:

string myPassword = "passwordToTest";
string mySalt = BCrypt.GenerateSalt();
//mySalt == "$2a$10$rBV2JDeWW3.vKyeQcM8fFO"
string myHash = BCrypt.HashPassword(myPassword, mySalt);
//myHash == "$2a$10$rBV2JDeWW3.vKyeQcM8fFO4777l4bVeQgDL6VIkxqlzQ7TCalQvla"
bool doesPasswordMatch = BCrypt.CheckPassword(myPassword, myHash);

Otherwise, I'll think CheckPassword is a magical functionality of bcrypt. On the latest version of BCrypt.NET from Nuget, the CheckPassword functionality is missing. Seeing how bcrypt hashing and checking works (via PostgreSQL example), CheckPassword is just a simple code:

[TestClass]
public class TheUnitTest
{
    [TestMethod]
    public void Test_if_password_matched()
    {
        // Arrange
        // hashed is what is stored in database
        string hashed = BCrypt.Net.BCrypt.HashPassword("passwordToTest", BCrypt.Net.BCrypt.GenerateSalt(12));

        // Act
        // then check user login password against the saved hash
        bool matches = hashed == BCrypt.Net.BCrypt.HashPassword("passwordToTest", hashed);

        // Assert
        Assert.IsTrue(matches);
    }
}

Happy Coding!

Wednesday, November 28, 2012

Linq's syntax symmetry with Postgres

Can't help but notice, Linq multiple condition has more affinity with Postgres flexible syntax:

from c in Customers
join o in Orders on new {c.CompanyID, c.CustomerID} equals new {o.CompanyID, o.CustomerID}

Postgresql:

from Customers as c 
join Orders as o on (c.CompanyID,c.CustomerID) = (o.CompanyID,o.CustomerID)

Saturday, April 30, 2011

Postgresql pain point on ORMs

Is it a breaking changes if Postgresql will make their table and field names case-insensitive? Postgresql table and field cases automatically folded to lowercase hurts badly on ORMs

Use this for the meantime:

public static string ToLowercaseNamingConvention(this string s, bool toLowercase)
{
    if (toLowercase)
    {
        var r = new Regex(@"
        (?<=[A-Z])(?=[A-Z][a-z]) |
         (?<=[^A-Z])(?=[A-Z]) |
         (?<=[A-Za-z])(?=[^A-Za-z])", RegexOptions.IgnorePatternWhitespace);

        return r.Replace(s, "_").ToLower();
    }
    else
        return s;
}

Sample here: http://code.google.com/p/fluent-nhibernate-lowercase-system/source/checkout

Sunday, February 6, 2011

How to find the second best grade of a student. Using Postgresql user-defined aggregate

You can define your own aggregate function in Postgres, let's say you want to get only two array elements from a set of rows, you can use this (this violates DestroyCity principle though, but we are constrained by the fact that user-defined-aggregates accepts function with two parameters only, so there's nothing we can do about it):

create aggregate two_elements(anyelement)
(
sfunc = array_limit_two,
stype = anyarray,
initcond = '{}'
);

create or replace function array_limit_two(anyarray, anyelement) returns anyarray
as 
$$
begin
 if array_upper($1,1) = 2 then
  return $1;
 else
  return array_append($1, $2);
 end if;
end;
$$ language 'plpgsql';

Get the data here: How to find the second best grade of a student?

Then this...

select student, two_elements(grade order by grade desc)
from 
student_grades
group by student;

...will return(two best grades of a student):

student | two_elements 
---------+--------------
 george  | {40}
 john    | {100,90}
 paul    | {50,30}

To return the second best grades only, use this:

select student, array_min( two_elements(grade order by grade desc) )
from 
student_grades
group by student;

Here's the array_min function:

create or replace function array_min(anyarray) returns anyelement
as
$$
select min(unnested) from( select unnest($1) unnested ) as x
$$ language sql;

 student | array_min 
---------+-----------
 george  |        40
 john    |        90
 paul    |        30
(3 rows)

I think this code is more performant than the previous one

Wisdom of the day:

Unless you actually are going to solve the general problem, don't try and put in place a framework for solving a specific one, because you don't know what that framework should look like.

Wisdom source: http://tirania.org/blog/archive/2003/Sep-29.html

How to find the second best grade of a student?

How to find the second best grade of the student? (If there's no second grade, just return the first one)

create table student_grades
(
student varchar(100),
grade int
);

insert into student_grades values 
('john',70),
('john',80),
('john',90),
('john',100);


insert into student_grades values
('paul',20),
('paul',10),

('paul',50),
('paul',30);


insert into student_grades values
('george',40);

Desired output:

 student | grade 
---------+-------
 george  |    40
 john    |    90
 paul    |    30
(3 rows)

Here's the code:

with ranking as
(
 select student, grade, rank() over(partition by student order by grade DESC) as place
 from 
 student_grades
)
select student, grade
from
ranking x
where 
exists (
  select null 
  from ranking
  where place <= 2
   and student = x.student 
  group by student
  having max(place) = x.place
 )

The following is shorter(works on Postgresql, Oracle) and simpler to read than above:

with ranking as
(
 select student, grade, rank() over(partition by student order by grade DESC) as place
 from 
 student_grades
)
select student, grade
from
ranking 
where (student, place) in
 (
  select student, max(place) 
  from ranking
  where place <= 2
  group by student  
 )

If only Postgres allows LIMIT clause inside an aggregation and made array_max a built-in function, this could be the shortest code:

select student, array_max( array_agg(grade order by grade desc limit 2) )
from 
student_grades
group by student;

Friday, January 14, 2011

Postgresql LAG windowing function

Given this:

create table t
(
ts timestamp not null,
code int not null
);

insert into t values
('2011-01-13 10:00:00', 5),
('2011-01-13 10:10:00', 5),
('2011-01-13 10:20:00', 5),
('2011-01-13 10:30:00', 5),
('2011-01-13 10:40:00', 0),
('2011-01-13 10:50:00', 1),
('2011-01-13 11:00:00', 1),
('2011-01-13 11:10:00', 1),
('2011-01-13 11:20:00', 0),
('2011-01-13 11:30:00', 5),
('2011-01-13 11:40:00', 5),
('2011-01-13 11:50:00', 3),
('2011-01-13 12:00:00', 3),
('2011-01-13 12:10:00', 3);

An stackoverflow user asked:

How can I select the first date of each set of identical numbers, so I end up with this:
2011-01-13 10:00:00, 5
2011-01-13 10:40:00, 0
2011-01-13 10:50:00, 1
2011-01-13 11:20:00, 0
2011-01-13 11:30:00, 5
2011-01-13 11:50:00, 3

I answered:

with sequencer as 
(
SELECT ROW_NUMBER() OVER(ORDER BY ts) seq, ts, Code
FROM T
)
select a.ts, a.Code
from sequencer a 
left join sequencer b on b.seq + 1 = a.seq 
where b.code IS DISTINCT FROM a.code;

But I'm not giving the problem a first-class treatment it deserves, I noticed somebody gave an answer that uses lag function, albeit he/she uses where prevcode <> code or prevcode is null

with r as
(
select ts, code, lag(code,1,null) over(order by ts) as prevcode
from t
)
select ts, code 
from r
where prevcode is distinct from code

Hmm.. :-) that code is very elegant, sleek and performant

http://www.postgresql.org/docs/8.4/static/functions-window.html

Thursday, December 23, 2010

Date functions

Michael-Buens-MacBook:~ Michael$ cal 2010
                             2010

      January               February               March
Su Mo Tu We Th Fr Sa  Su Mo Tu We Th Fr Sa  Su Mo Tu We Th Fr Sa
                1  2      1  2  3  4  5  6      1  2  3  4  5  6
 3  4  5  6  7  8  9   7  8  9 10 11 12 13   7  8  9 10 11 12 13
10 11 12 13 14 15 16  14 15 16 17 18 19 20  14 15 16 17 18 19 20
17 18 19 20 21 22 23  21 22 23 24 25 26 27  21 22 23 24 25 26 27
24 25 26 27 28 29 30  28                    28 29 30 31
31                                          
       April                  May                   June
Su Mo Tu We Th Fr Sa  Su Mo Tu We Th Fr Sa  Su Mo Tu We Th Fr Sa
             1  2  3                     1         1  2  3  4  5
 4  5  6  7  8  9 10   2  3  4  5  6  7  8   6  7  8  9 10 11 12
11 12 13 14 15 16 17   9 10 11 12 13 14 15  13 14 15 16 17 18 19
18 19 20 21 22 23 24  16 17 18 19 20 21 22  20 21 22 23 24 25 26
25 26 27 28 29 30     23 24 25 26 27 28 29  27 28 29 30
                      30 31                 
        July                 August              September
Su Mo Tu We Th Fr Sa  Su Mo Tu We Th Fr Sa  Su Mo Tu We Th Fr Sa
             1  2  3   1  2  3  4  5  6  7            1  2  3  4
 4  5  6  7  8  9 10   8  9 10 11 12 13 14   5  6  7  8  9 10 11
11 12 13 14 15 16 17  15 16 17 18 19 20 21  12 13 14 15 16 17 18
18 19 20 21 22 23 24  22 23 24 25 26 27 28  19 20 21 22 23 24 25
25 26 27 28 29 30 31  29 30 31              26 27 28 29 30
                                            
      October               November              December
Su Mo Tu We Th Fr Sa  Su Mo Tu We Th Fr Sa  Su Mo Tu We Th Fr Sa
                1  2      1  2  3  4  5  6            1  2  3  4
 3  4  5  6  7  8  9   7  8  9 10 11 12 13   5  6  7  8  9 10 11
10 11 12 13 14 15 16  14 15 16 17 18 19 20  12 13 14 15 16 17 18
17 18 19 20 21 22 23  21 22 23 24 25 26 27  19 20 21 22 23 24 25
24 25 26 27 28 29 30  28 29 30              26 27 28 29 30 31
31

Months' first Mondays...

#
    base_date    | month_first_monday 
-----------------+--------------------
 Jan 01 2010 Fri | Jan 04 2010 Mon
 Feb 01 2010 Mon | Feb 01 2010 Mon
 Mar 01 2010 Mon | Mar 01 2010 Mon
 Apr 01 2010 Thu | Apr 05 2010 Mon
 May 01 2010 Sat | May 03 2010 Mon
 Jun 01 2010 Tue | Jun 07 2010 Mon
 Jul 01 2010 Thu | Jul 05 2010 Mon
 Aug 01 2010 Sun | Aug 02 2010 Mon
 Sep 01 2010 Wed | Sep 06 2010 Mon
 Oct 01 2010 Fri | Oct 04 2010 Mon
 Nov 01 2010 Mon | Nov 01 2010 Mon
 Dec 01 2010 Wed | Dec 06 2010 Mon
(12 rows)

...query (month_beginning_dow):

select to_char(z.d, 'Mon dd yyyy Dy') as base_date, 

    to_char( month_beginning_dow(extract(year from z.d)::int, 
        extract(month from z.d)::int, 1), 'Mon dd yyyy Dy') 
            as month_first_monday
from
(
     select 'Jan 1 2010'::date + (interval '1' month * x.n) as d
     from generate_series(0,11) as x(n)
) as z

Months' last Sundays...

#
    base_date    | month_last_date | month_last_sunday 
-----------------+-----------------+-------------------
 Jan 01 2010 Fri | Jan 31 2010 Sun | Jan 31 2010 Sun
 Feb 01 2010 Mon | Feb 28 2010 Sun | Feb 28 2010 Sun
 Mar 01 2010 Mon | Mar 31 2010 Wed | Mar 28 2010 Sun
 Apr 01 2010 Thu | Apr 30 2010 Fri | Apr 25 2010 Sun
 May 01 2010 Sat | May 31 2010 Mon | May 30 2010 Sun
 Jun 01 2010 Tue | Jun 30 2010 Wed | Jun 27 2010 Sun
 Jul 01 2010 Thu | Jul 31 2010 Sat | Jul 25 2010 Sun
 Aug 01 2010 Sun | Aug 31 2010 Tue | Aug 29 2010 Sun
 Sep 01 2010 Wed | Sep 30 2010 Thu | Sep 26 2010 Sun
 Oct 01 2010 Fri | Oct 31 2010 Sun | Oct 31 2010 Sun
 Nov 01 2010 Mon | Nov 30 2010 Tue | Nov 28 2010 Sun
 Dec 01 2010 Wed | Dec 31 2010 Fri | Dec 26 2010 Sun
(12 rows)

...query:

select to_char(z.d, 'Mon dd yyyy Dy') as base_date,
to_char(month_last_date(extract(year from z.d)::int, 
    extract(month from z.d)::int), 'Mon dd yyyy Dy') 
        as month_last_date,
to_char( month_ending_dow(extract(year from z.d)::int, 
    extract(month from z.d)::int, 0), 'Mon dd yyyy Dy') 
        as month_last_sunday       
from
(
     select 'Jan 1 2010'::date + (interval '1' month * x.n) as d
     from generate_series(0,11) as x(n)
) as z

Date library:

CREATE FUNCTION month_last_date(_year integer, _month integer)
  RETURNS date AS
$$
 select (($1 || '-' || $2 || '-1')::date + interval '1' month)::date - 1;
$$
  LANGUAGE sql immutable;



-- this function and the next function, has nice symmetry to each other
CREATE FUNCTION date_next_dow(_base_date date, _dow integer)
  RETURNS date AS
$$
 select
  case when extract(dow from $1) <= $2 then
   $1 + ($2 - (extract(dow from $1)::int))
  else
   $1 + ($2 - (extract(dow from $1)::int)) + 7
  end 
$$
  LANGUAGE sql immutable;


CREATE FUNCTION date_previous_dow(_base_date date, _dow integer)
  RETURNS date AS
$$
 select
  case when $2 <= extract(dow from $1) then
   $1 + ($2 - (extract(dow from $1)::int))
  else
   $1 + ($2 - (extract(dow from $1)::int)) - 7
  end;
$$
  LANGUAGE sql immutable;  



CREATE FUNCTION month_beginning_dow(_year integer, _month integer, _dow integer)
  RETURNS date AS
$$
 select date_next_dow( ($1 || '-' || $2 || '-1')::date, $3);
  
$$
  LANGUAGE sql immutable;


CREATE FUNCTION month_ending_dow(_year integer, _month integer, _dow integer)
  RETURNS date AS
$$
 select date_previous_dow( month_last_date( $1, $2 ), $3);
$$
  LANGUAGE sql immutable;

Other usage, next Friday...

#
    base_date    |   next_friday   
-----------------+-----------------
 Nov 05 2010 Fri | Nov 05 2010 Fri
 Nov 06 2010 Sat | Nov 12 2010 Fri
 Nov 07 2010 Sun | Nov 12 2010 Fri
 Nov 08 2010 Mon | Nov 12 2010 Fri
 Nov 09 2010 Tue | Nov 12 2010 Fri
 Nov 10 2010 Wed | Nov 12 2010 Fri
 Nov 11 2010 Thu | Nov 12 2010 Fri
 Nov 12 2010 Fri | Nov 12 2010 Fri
 Nov 13 2010 Sat | Nov 19 2010 Fri
 Nov 14 2010 Sun | Nov 19 2010 Fri
 Nov 15 2010 Mon | Nov 19 2010 Fri
 Nov 16 2010 Tue | Nov 19 2010 Fri
 Nov 17 2010 Wed | Nov 19 2010 Fri
 Nov 18 2010 Thu | Nov 19 2010 Fri
 Nov 19 2010 Fri | Nov 19 2010 Fri
 Nov 20 2010 Sat | Nov 26 2010 Fri
 Nov 21 2010 Sun | Nov 26 2010 Fri
 Nov 22 2010 Mon | Nov 26 2010 Fri
(18 rows)

..., query:

select to_char(z.d, 'Mon dd yyyy Dy') as base_date, 
       to_char( date_next_dow(z.d, 5), 'Mon dd yyyy Dy') as next_friday
from
(
     select 'Nov 5 2010'::date + x.n as d
     from generate_series(0,17) as x(n)
) as z

Previous Saturday...

#
    base_date    |  previous_saturday  
-----------------+-------------------
 Nov 05 2010 Fri | Oct 30 2010 Sat
 Nov 06 2010 Sat | Nov 06 2010 Sat
 Nov 07 2010 Sun | Nov 06 2010 Sat
 Nov 08 2010 Mon | Nov 06 2010 Sat
 Nov 09 2010 Tue | Nov 06 2010 Sat
 Nov 10 2010 Wed | Nov 06 2010 Sat
 Nov 11 2010 Thu | Nov 06 2010 Sat
 Nov 12 2010 Fri | Nov 06 2010 Sat
 Nov 13 2010 Sat | Nov 13 2010 Sat
 Nov 14 2010 Sun | Nov 13 2010 Sat
 Nov 15 2010 Mon | Nov 13 2010 Sat
 Nov 16 2010 Tue | Nov 13 2010 Sat
 Nov 17 2010 Wed | Nov 13 2010 Sat
 Nov 18 2010 Thu | Nov 13 2010 Sat
 Nov 19 2010 Fri | Nov 13 2010 Sat
 Nov 20 2010 Sat | Nov 20 2010 Sat
 Nov 21 2010 Sun | Nov 20 2010 Sat
 Nov 22 2010 Mon | Nov 20 2010 Sat
(18 rows)

...query:

select to_char(z.d, 'Mon dd yyyy Dy') as base_date, 
       to_char( date_previous_dow(z.d, 6), 'Mon dd yyyy Dy') 
           as previous_saturday
from
(
     select 'Nov 5 2010'::date + x.n as d
     from generate_series(0,17) as x(n)
) as z

Happy new year folks!

Related to: Finding previous day of the week

Tuesday, December 21, 2010

Custom Sort Order on Postgresql

Given the countries database, how to place US on top of the list?

Click to expand countries database:

create table countries
(
country text not null unique
);

create index on countries(country);


insert into countries values
('Afghanistan'),
('Albania'),
('Algeria'),
('American Samoa'),
('Andorra'),
('Angola'),
('Anguilla'),
('Antarctica'),
('Antigua and Barbuda'),
('Argentina'),
('Armenia'),
('Aruba'),
('Australia'),
('Austria'),
('Azerbaijan'),
('Bahamas'),
('Bahrain'),
('Bangladesh'),
('Barbados'),
('Belarus'),
('Belgium'),
('Belize'),
('Benin'),
('Bermuda'),
('Bhutan'),
('Bolivia'),
('Bosnia and Herzegovina'),
('Botswana'),
('Bouvet Island'),
('Brazil'),
('British Indian Ocean Territory'),
('Brunei Darussalam'),
('Bulgaria'),
('Burkina Faso'),
('Burundi'),
('Cambodia'),
('Cameroon'),
('Canada'),
('Cape Verde'),
('Cayman Islands'),
('Central African Republic'),
('Chad'),
('Chile'),
('China'),
('Christmas Island'),
('Cocos (Keeling) Islands'),
('Colombia'),
('Comoros'),
('Congo'),
('Congo, The Democratic Republic of The'),
('Cook Islands'),
('Costa Rica'),
('Cote D''ivoire'),
('Croatia'),
('Cuba'),
('Cyprus'),
('Czech Republic'),
('Denmark'),
('Djibouti'),
('Dominica'),
('Dominican Republic'),
('Easter Island'),
('Ecuador'),
('Egypt'),
('El Salvador'),
('Equatorial Guinea'),
('Eritrea'),
('Estonia'),
('Ethiopia'),
('Falkland Islands (Malvinas)'),
('Faroe Islands'),
('Fiji'),
('Finland'),
('France'),
('French Guiana'),
('French Polynesia'),
('French Southern Territories'),
('Gabon'),
('Gambia'),
('Georgia'),
('Germany'),
('Ghana'),
('Gibraltar'),
('Greece'),
('Greenland'),
('Grenada'),
('Guadeloupe'),
('Guam'),
('Guatemala'),
('Guinea'),
('Guinea-bissau'),
('Guyana'),
('Haiti'),
('Heard Island and Mcdonald Islands'),
('Honduras'),
('Hong Kong'),
('Hungary'),
('Iceland'),
('India'),
('Indonesia'),
('Iran'),
('Iraq'),
('Ireland'),
('Israel'),
('Italy'),
('Jamaica'),
('Japan'),
('Jordan'),
('Kazakhstan'),
('Kenya'),
('Kiribati'),
('Korea, North'),
('Kosovo'),
('Kuwait'),
('Kyrgyzstan'),
('Laos'),
('Latvia'),
('Lebanon'),
('Lesotho'),
('Liberia'),
('Libyan Arab Jamahiriya'),
('Liechtenstein'),
('Lithuania'),
('Luxembourg'),
('Macau'),
('Macedonia'),
('Madagascar'),
('Malawi'),
('Malaysia'),
('Maldives'),
('Mali'),
('Malta'),
('Marshall Islands'),
('Martinique'),
('Mauritania'),
('Mauritius'),
('Mayotte'),
('Mexico'),
('Micronesia, Federated States of'),
('Moldova, Republic of'),
('Monaco'),
('Mongolia'),
('Montenegro'),
('Montserrat'),
('Morocco'),
('Mozambique'),
('Myanmar'),
('Namibia'),
('Nauru'),
('Nepal'),
('Netherlands'),
('Netherlands Antilles'),
('New Caledonia'),
('New Zealand'),
('Nicaragua'),
('Niger'),
('Nigeria'),
('Niue'),
('Norfolk Island'),
('Northern Mariana Islands'),
('Norway'),
('Oman'),
('Pakistan'),
('Palau'),
('Palestinian Territory'),
('Panama'),
('Papua New Guinea'),
('Paraguay'),
('Peru'),
('Philippines'),
('Pitcairn'),
('Poland'),
('Portugal'),
('Puerto Rico'),
('Qatar'),
('Reunion'),
('Romania'),
('Russia'),
('Rwanda'),
('Saint Helena'),
('Saint Kitts and Nevis'),
('Saint Lucia'),
('Saint Pierre and Miquelon'),
('Saint Vincent and The Grenadines'),
('Samoa'),
('San Marino'),
('Sao Tome and Principe'),
('Saudi Arabia'),
('Senegal'),
('Serbia and Montenegro'),
('Seychelles'),
('Sierra Leone'),
('Singapore'),
('Slovakia'),
('Slovenia'),
('Solomon Islands'),
('Somalia'),
('South Africa'),
('South Georgia and The South Sandwich Islands'),
('Spain'),
('Sri Lanka'),
('Sudan'),
('Suriname'),
('Svalbard and Jan Mayen'),
('Swaziland'),
('Sweden'),
('Switzerland'),
('Syria'),
('Taiwan'),
('Tajikistan'),
('Tanzania, United Republic of'),
('Thailand'),
('Timor-leste'),
('Togo'),
('Tokelau'),
('Tonga'),
('Trinidad and Tobago'),
('Tunisia'),
('Turkey'),
('Turkmenistan'),
('Turks and Caicos Islands'),
('Tuvalu'),
('Uganda'),
('Ukraine'),
('United Arab Emirates'),
('United Kingdom'),
('United States'),
('United States Minor Outlying Islands'),
('Uruguay'),
('Uzbekistan'),
('Vanuatu'),
('Vatican City'),
('Venezuela'),
('Vietnam'),
('Virgin Islands, British'),
('Virgin Islands, U.S.'),
('Wallis and Futuna'),
('Western Sahara'),
('Yemen'),
('Zambia'),
('Zimbabwe');

In Postgres, false sorts first, true sorts last

select n from (values(true),(false)) as x(n) order by n

n 
---
 f
 t
(2 rows)

So here's how to sort non-US last (which essentially sorts US first):

select country from countries order by country <> 'United States', country

If you want to use it on other RDBMS, use CASE WHEN yourConditionHere and use integer:

select country from countries 
order by 

 case when country = 'United States' then 1 else 2 end
 
 ,country

If you want to sort two countries first, use CASE yourFieldNameHere:

select country from countries 
order by 
 case country 
 when 'United States' then 1 
 when 'Philippines' then 2
 else 3 
 end
 
 ,country

Keyword search: Postgres sort order custom

Monday, December 20, 2010

ANY clause is superior to IN clause. At least on Postgresql :-)

Why is ANY clause superior to IN clause? Postgresql's ANY accepts array, which in turn can be manipulated further.

To illustrate the point, you cannot do this with IN expression:

select * from y 
where lower(email) IN 
    lower( ('ISteve.Jobs@Apple.com','Linus.Torvalds@Linux.com') );

Neither this:

select * from y 
where lower(email) 
    IN ( lower ('ISteve.Jobs@Apple.com','Linus.Torvalds@Linux.com') );

But with ANY clause, you can manipulate the array before passing values to it, a trick you cannot do on IN clause:

create function lower(t text[]) returns text[]
as
$$
select lower($1::text)::text[]
$$ language sql;
 
 
select * from y 
where lower(email) = 
    ANY( lower(ARRAY['ISteve.Jobs@Apple.com'
                     ,'Linus.Torvalds@Linux.com']) );

Sunday, December 19, 2010

Postgresql case-insensitive design and searching choice

First, what not to do, don't use ilike...

create table y
(
id serial not null,
email text not null unique
);

insert into y(email) 
values('iSteve.jobs@apple.com') ,('linus.Torvalds@linUX.com');
insert into y(email) 
select n from generate_series(1,1000) as i(n);
create index ix_y on y(email);

explain select * from y 
where email ilike 
    ANY(ARRAY['ISteve.Jobs@Apple.com','Linus.Torvalds@Linux.com']);

...,as that would result to Sequential Scan:

memdb=# explain select * from y where email ilike ANY(ARRAY['ISteve.Jobs@Apple.com','Linus.Torvalds@Linux.com']);
                                       QUERY PLAN                                       
----------------------------------------------------------------------------------------
 Seq Scan on y  (cost=0.00..17.52 rows=1 width=7)
   Filter: (email ~~* ANY ('{ISteve.Jobs@Apple.com,Linus.Torvalds@Linux.com}'::text[]))
(2 rows)

It's either you create an indexed lower expression...

create function lower(t text[]) returns text[]
as
$$
select lower($1::text)::text[]
$$ language sql;

create unique index ix_y_2 on y(lower(email));

explain select * from y 
where lower(email) = 
    ANY(lower(ARRAY['ISteve.Jobs@Apple.com','Linus.Torvalds@Linux.com']));

...which properly uses index:

memdb=# explain select * from y where lower(email) = ANY(lower(ARRAY['ISteve.Jobs@Apple.com','Linus.Torvalds@Linux.com']));
                                                           QUERY PLAN                                                           
--------------------------------------------------------------------------------------------------------------------------------
 Bitmap Heap Scan on y  (cost=22.60..27.98 rows=10 width=7)
   Recheck Cond: (lower(email) = ANY ((lower(('{ISteve.Jobs@Apple.com,Linus.Torvalds@Linux.com}'::text[])::text))::text[]))
   ->  Bitmap Index Scan on ix_y_2  (cost=0.00..22.60 rows=10 width=0)
         Index Cond: (lower(email) = ANY ((lower(('{ISteve.Jobs@Apple.com,Linus.Torvalds@Linux.com}'::text[])::text))::text[]))
(4 rows)

Or you use citext data type...

create table x
(
id serial not null,
email citext not null unique
);

insert into x(email) 
values('iSteve.jobs@apple.com'),('linus.Torvalds@linUX.com');
insert into x(email) 
select n from generate_series(1,1000) as i(n);
create index ix_x on x(email);

explain select * from x 
where email = 
ANY(ARRAY['ISteve.Jobs@Apple.com','Linus.Torvalds@Linux.com']::citext[]);

...which properly uses index even you don't create an index on expression (e.g. create index zzz on yyy(lower(field))):

memdb=# explain select * from x where email = ANY(ARRAY['ISteve.Jobs@Apple.com','Linus.Torvalds@Linux.com']::citext[]);
                                            QUERY PLAN                                            
--------------------------------------------------------------------------------------------------
 Bitmap Heap Scan on x  (cost=8.52..12.75 rows=2 width=7)
   Recheck Cond: (email = ANY ('{ISteve.Jobs@Apple.com,Linus.Torvalds@Linux.com}'::citext[]))
   ->  Bitmap Index Scan on ix_x  (cost=0.00..8.52 rows=2 width=0)
         Index Cond: (email = ANY ('{ISteve.Jobs@Apple.com,Linus.Torvalds@Linux.com}'::citext[]))
(4 rows)

Happy Indexing!

Finding previous day of the week

This question in Stackoverflow...

In PostgreSQL 8.4, given a date, if that date is not a Friday, I would like to find the date of the previous Friday. Can someone tell me if there is an inbuilt function or give the logic behind getting my own function.

...has an accepted answer of:

SELECT 
    CASE 
-- 1. if Friday, return date
    WHEN EXTRACT(DOW FROM my_date) = 5 
    THEN my_date
-- 2. if Saturday, subtract 1
    WHEN EXTRACT(DOW FROM my_date) = 6 
    THEN my_date - INTERVAL '1 day'
-- 3. all other days of the week, subtract `DOW + 2` from my_date
    -- should be ELSE for future-proofing ;-) MB
    ELSE -- WHEN EXTRACT(DOW FROM my_date) < 5 THEN
        my_date - ((EXTRACT(DOW FROM my_date) + 2)::TEXT||'days')::INTERVAL
    END AS tgif
FROM 
    my_table
WHERE 
    my_date IS NOT NULL

That query is hard to modify if the requirement changes. What's wrong with that? For one, there is special handling of a corner case, i.e. 6, for Saturday. Second, there's a magic number in that query, which is number 2. And there is complication when the requirement changes, when you change the parameter to Thursday, what need to be changed? The obvious are: change the 6 and 5 to 5 and 4 respectively; what's the not so obvious is the number 2, where is that number derived? Should we increment it, or should we decrement it? Though in the defense of the SOer, one cannot introduce a variable outside of function in Postgresql.

So to make the intent more clear, here's his functionalized routine:

create function recent_day(_dow int, out src text, out dst text) returns setof record
as
$$

SELECT 
 to_char(your_date, 'Mon dd yyyy dy'),

 to_char(
     CASE 
 -- 1. if Friday, return date
     WHEN EXTRACT(DOW FROM your_date) = $1
     THEN your_date
 -- 2. if Saturday, subtract 1
     WHEN EXTRACT(DOW FROM your_date) =  $1 + 1 
     THEN your_date - INTERVAL '1 day'
 -- 3. all other days of the week, subtract `DOW + 2` from your_date
     ELSE -- WHEN EXTRACT(DOW FROM your_date) < 4 THEN 
  your_date - ((EXTRACT(DOW FROM your_date) + (7 - $1))::TEXT||'days')::INTERVAL
     END,
  'Mon dd yyyy dy')
      AS tgif

from
(
     select 'Nov 5 2010'::date + x.n as your_date
     from generate_series(0,17) as x(n)
) as x
$$ language 'sql';

Here's the output of that function:

memdb=# select * from recent_day(5);
       src       |       dst       
-----------------+-----------------
 Nov 05 2010 fri | Nov 05 2010 fri
 Nov 06 2010 sat | Nov 05 2010 fri
 Nov 07 2010 sun | Nov 05 2010 fri
 Nov 08 2010 mon | Nov 05 2010 fri
 Nov 09 2010 tue | Nov 05 2010 fri
 Nov 10 2010 wed | Nov 05 2010 fri
 Nov 11 2010 thu | Nov 05 2010 fri
 Nov 12 2010 fri | Nov 12 2010 fri
 Nov 13 2010 sat | Nov 12 2010 fri
 Nov 14 2010 sun | Nov 12 2010 fri
 Nov 15 2010 mon | Nov 12 2010 fri
 Nov 16 2010 tue | Nov 12 2010 fri
 Nov 17 2010 wed | Nov 12 2010 fri
 Nov 18 2010 thu | Nov 12 2010 fri
 Nov 19 2010 fri | Nov 19 2010 fri
 Nov 20 2010 sat | Nov 19 2010 fri
 Nov 21 2010 sun | Nov 19 2010 fri
 Nov 22 2010 mon | Nov 19 2010 fri
(18 rows)

So it's correct :-)

Now let's try Thursday :-)

memdb=# select * from recent_day(4);
       src       |       dst       
-----------------+-----------------
 Nov 05 2010 fri | Nov 04 2010 thu
 Nov 06 2010 sat | Oct 28 2010 thu
 Nov 07 2010 sun | Nov 04 2010 thu
 Nov 08 2010 mon | Nov 04 2010 thu
 Nov 09 2010 tue | Nov 04 2010 thu
 Nov 10 2010 wed | Nov 04 2010 thu
 Nov 11 2010 thu | Nov 11 2010 thu
 Nov 12 2010 fri | Nov 11 2010 thu
 Nov 13 2010 sat | Nov 04 2010 thu
 Nov 14 2010 sun | Nov 11 2010 thu
 Nov 15 2010 mon | Nov 11 2010 thu
 Nov 16 2010 tue | Nov 11 2010 thu
 Nov 17 2010 wed | Nov 11 2010 thu
 Nov 18 2010 thu | Nov 18 2010 thu
 Nov 19 2010 fri | Nov 18 2010 thu
 Nov 20 2010 sat | Nov 11 2010 thu
 Nov 21 2010 sun | Nov 18 2010 thu
 Nov 22 2010 mon | Nov 18 2010 thu
(18 rows)

Uh oh, something wrong. Look at Nov 20's recent Thursday, it should be November 18, yet in his query it is November 11. Lesson of the day, sometimes when you handle corner cases, it only handles corner cases very well ;-)

This is my solution(though unfortunately was not accepted):

create or replace function recent_day_mb(_dow int, out src text, out dst text) returns setof record
as
$$
select
 to_char(ds, 'Mon dd yyyy dy'), 

 to_char(

  ds - (extract(dow from ds) - $1)::int +

   case when extract(dow from ds) - $1 < 0 then -7 
   else 0 end

  ,'Mon dd yyyy dy') as recent_friday



from
(
     select 'Nov 5 2010'::date + x.n as your_date
     from generate_series(0,17) as x(n)
) as x
$$ language 'sql';

Let's try Friday

memdb=# select * from recent_day_mb(5);
       src       |       dst       
-----------------+-----------------
 Nov 05 2010 fri | Nov 05 2010 fri
 Nov 06 2010 sat | Nov 05 2010 fri
 Nov 07 2010 sun | Nov 05 2010 fri
 Nov 08 2010 mon | Nov 05 2010 fri
 Nov 09 2010 tue | Nov 05 2010 fri
 Nov 10 2010 wed | Nov 05 2010 fri
 Nov 11 2010 thu | Nov 05 2010 fri
 Nov 12 2010 fri | Nov 12 2010 fri
 Nov 13 2010 sat | Nov 12 2010 fri
 Nov 14 2010 sun | Nov 12 2010 fri
 Nov 15 2010 mon | Nov 12 2010 fri
 Nov 16 2010 tue | Nov 12 2010 fri
 Nov 17 2010 wed | Nov 12 2010 fri
 Nov 18 2010 thu | Nov 12 2010 fri
 Nov 19 2010 fri | Nov 19 2010 fri
 Nov 20 2010 sat | Nov 19 2010 fri
 Nov 21 2010 sun | Nov 19 2010 fri
 Nov 22 2010 mon | Nov 19 2010 fri
(18 rows)

It's correct

Now let's try Thursday:

memdb=# select * from recent_day_mb(4);
       src       |       dst       
-----------------+-----------------
 Nov 05 2010 fri | Nov 04 2010 thu
 Nov 06 2010 sat | Nov 04 2010 thu
 Nov 07 2010 sun | Nov 04 2010 thu
 Nov 08 2010 mon | Nov 04 2010 thu
 Nov 09 2010 tue | Nov 04 2010 thu
 Nov 10 2010 wed | Nov 04 2010 thu
 Nov 11 2010 thu | Nov 11 2010 thu
 Nov 12 2010 fri | Nov 11 2010 thu
 Nov 13 2010 sat | Nov 11 2010 thu
 Nov 14 2010 sun | Nov 11 2010 thu
 Nov 15 2010 mon | Nov 11 2010 thu
 Nov 16 2010 tue | Nov 11 2010 thu
 Nov 17 2010 wed | Nov 11 2010 thu
 Nov 18 2010 thu | Nov 18 2010 thu
 Nov 19 2010 fri | Nov 18 2010 thu
 Nov 20 2010 sat | Nov 18 2010 thu
 Nov 21 2010 sun | Nov 18 2010 thu
 Nov 22 2010 mon | Nov 18 2010 thu
(18 rows)

See the Nov 20, its recent Thursday is November 18, not November 11. Works well :-)

Happy querying!

[EDIT]

My most succint solution:

create or replace function previous_date_of_day(the_date date, dow int) returns date
as
$$
select 
    case when extract(dow from $1) < $2 then
        $1 - ( extract(dow from $1) + (7 - $2) )::int 
    else
        $1 - ( extract(dow from $1) - $2)::int
    end;
$$ language 'sql';


select to_char(z.ds, 'Mon dd yyyy dy') as source, 
       to_char( previous_date_of_day(z.ds, 5), 'Mon dd yyyy dy') as dest
from
(
     select 'Nov 5 2010'::date + x.n as ds
     from generate_series(0,17) as x(n)
) as z

[EDIT]

The asker(Rabin) already changed the accepted answer to mine. Thanks Rabin for acknowledging code correctness! :-)

[EDIT 2010-12-04]

Here's the other related functions, the function previous_date_of_day in this article is simplifed to date_previous_dow, check here : http://www.ienablemuch.com/2010/12/date-functions.html

Thursday, December 16, 2010

Postgresql dec to hex

test=# select to_hex(12648430) as favorite_drink, to_hex(3405697037) as person_type;
 favorite_drink | person_type
----------------+-------------
 c0ffee         | cafed00d
(1 row)

Brownfield system problem on Fluent NHibernate. And solution

If you have these tables:

create table Person
(
Id int identity(1,1) not null primary key,
PersonName varchar(255) not null,
BirthDate Date not null
);

create table Contact
(
OwnerId int not null references Person(Id),
Id int identity(1,1) not null primary key,
Number varchar(50) not null,
Type varchar(20) not null
);

Click to expand mapping

public class Person
    {
        public virtual int Id { get; set; }
        
        public virtual string PersonName { get; set; }
        public virtual DateTime Birthdate { get; set; }        
                    
        public virtual IList<Contact> Contacts { get; set; }
    }
    

    public class Contact
    {
        public virtual Person Owner { get; set; }
                        
        public virtual int Id { get; set; }        
        public virtual string Number { get; set; }        
        public virtual string Type { get; set; }


        
        
    }
    
    public class PersonMap : ClassMap <Person>
    {
        public PersonMap()
        {        
    
            Id(x => x.Id);
            Map(x => x.PersonName).Not.Nullable();
            Map(x => x.Birthdate).Not.Nullable();
            HasMany(x => x.Contacts).Inverse(); 
        }
    }


  
    public class ContactMap : ClassMap <Contact>
    {
        public ContactMap()
        {
            References(x => x.Owner);
            Id(x => x.Id);
            Map(x => x.Number).Not.Nullable();
            Map(x => x.Type).Not.Nullable();
        }
        
    }

This code...

ISessionFactory sf = 
        Fluently.Configure()
            .Database(MsSqlConfiguration.MsSql2008.ConnectionString(@"Data Source=.\SQLExpress;Initial Catalog=test3;Trusted_Connection=true"))
            .Mappings(m =>
                m.FluentMappings.AddFromAssemblyOf<MainClass>()
                .Conventions.Add(ForeignKey.EndsWith("Id"))                        
                )
            .BuildConfiguration()
            .BuildSessionFactory();

    var px = from p in sf.OpenSession().Query<Person>()
                select p;
         


    foreach (var p in px)
    {
        Console.WriteLine("{0} {1}", p.PersonName, p.Birthdate);

        foreach (var c in p.Contacts)
        {
            Console.WriteLine("{0}", c.Number);
        }
    }

...will produce an incorrect query on foreach(var c in p.Contacts) (note the extraneous PersonId):

SELECT 
contacts0_.Person_id as Person5_1_, 
contacts0_.Id as Id1_, 
contacts0_.Id as Id2_0_, 
contacts0_.Number as Number2_0_, 
contacts0_.Type as Type2_0_, 
contacts0_.OwnerId as OwnerId2_0_ 
FROM [Contact] contacts0_ 
WHERE contacts0_.PersonId=?

Even if you add this:

  
.Conventions.Add(
    ConventionBuilder.Reference
      .Always(x => x.Column(x.Name+ "Id" )))

The only way to rectify this problem is to set up the joining key(e.g. OwnerId) on both ends. That is, we must set it up on .HasMany.Always and .References.Always conventions

But first, we must design a nice API solution around this problem, this is what I came up with:

.Conventions.AddBrownfieldConventions(referenceSuffix: "Id");

And that works! :-) When your run the Linq code above, it will work now. Expand to check the code:

using System;

using System.Collections.Generic;
using System.Text.RegularExpressions;

using NHibernate;
using NHibernate.Dialect;

using FluentNHibernate.Cfg;
using FluentNHibernate.Cfg.Db;
using FluentNHibernate.Conventions;
using FluentNHibernate.Conventions.Helpers;
using FluentNHibernate.Mapping;
using System.Reflection;
using System.Collections;
using FluentNHibernate.Mapping.Providers;


using System.Linq;
using FluentNHibernate.BrownfieldSystem;
using FluentNHibernate.Conventions.Instances;

namespace FluentNHibernate.BrownfieldSystem
{
    public class ClassMapExt<T> : ClassMap<T>
    {
        public IList<IManyToOneMappingProvider> ExtReferences { get { return this.references; } }
    }

    public static class BrownfieldSystemHelper
    {


        public static T AddBrownfieldConventions<T>(this SetupConventionFinder<T> fluentMappingsContainer, string referenceSuffix, params IConvention[] otherConventions)
        {
            return fluentMappingsContainer.AddBrownfieldConventions(referenceSuffix, false, otherConventions);
        }

        public static T AddBrownfieldConventions<T>(this SetupConventionFinder<T> fluentMappingsContainer, string referenceSuffix, bool toLowercase, params IConvention[] otherConventions)
        {

            IList<IConvention> brown =
                new IConvention[]
                {
                    Table.Is(x => x.EntityType.Name.ToLowercaseNamingConvention(toLowercase))
                    ,ConventionBuilder.Property.Always(x => x.Column(x.Name.ToLowercaseNamingConvention(toLowercase)))
                    ,ConventionBuilder.Id.Always( x => x.Column(x.Name.ToLowercaseNamingConvention(toLowercase)) )        
                    ,ConventionBuilder.HasMany.Always(x => x.Key.Column( x.NormalizeReference().ToLowercaseNamingConvention(toLowercase) + referenceSuffix )  )
                
                    // Instead of this...
                    // ,ForeignKey.EndsWith(referenceSuffix)                                
                    // ... we do this, so we have direct control on Reference name's casing:
                    ,ConventionBuilder.Reference.Always(x => x.Column( x.Name.ToLowercaseNamingConvention(toLowercase) + referenceSuffix ) )
                
                };

            foreach (IConvention c in brown)
                fluentMappingsContainer.Add(c);

            return fluentMappingsContainer.Add(otherConventions);
        }


        public static string ToLowercaseNamingConvention(this string s)
        {
            return s.ToLowercaseNamingConvention(false);
        }

        public static string ToLowercaseNamingConvention(this string s, bool toLowercase)
        {
            if (toLowercase)
            {
                var r = new Regex(@"
                (?<=[A-Z])(?=[A-Z][a-z]) |
                 (?<=[^A-Z])(?=[A-Z]) |
                 (?<=[A-Za-z])(?=[^A-Za-z])", RegexOptions.IgnorePatternWhitespace);

                return r.Replace(s, "_").ToLower();
            }
            else
                return s;
        }


        public static string NormalizeReference(this IOneToManyCollectionInstance x)
        {
            foreach (Type t in x.ChildType.Assembly.GetTypes())
            {
                if (t.BaseType.IsGenericType)
                    if (t.BaseType.GetGenericTypeDefinition() == typeof(ClassMapExt<>))
                        if (t.BaseType.GetGenericArguments()[0] == x.ChildType) // e.g.Contact
                        {

                            var pz = Activator.CreateInstance(t); // ContactMapExt                                        
                            var extRef = (IList<IManyToOneMappingProvider>)t.InvokeMember("ExtReferences", BindingFlags.GetProperty, null, pz, null);

                            if (extRef.Count > 1) 
                                throw new Exception("Ambiguous collection found. Do explicit column mapping on both entity and value table");
                            foreach (IManyToOneMappingProvider imt1 in extRef)
                                return imt1.GetManyToOneMapping().Member.Name;
                        }
            }

            return "";

        }//Normalize


    }// class BrownfieldSystemHelper
}// namespace FluentNHibernate.BrownfieldSystem

What you need to set up in your program is to change all brownfield tables' ClassMap to ClassMapExt. ClassMapExt exposes the this.references(which by James Gregory's design is protected on ClassMap, the references contain the data structure for ManyToOne mapping), so we can use it in ConventionBuilder.HasMany.Always

To verify that our code is indeed working, this Fluently configuration(sans Brownfield) will not output the correct DDL...

string sx = string.Join(";" + Environment.NewLine,
  Fluently.Configure()
  .Database(MsSqlConfiguration.MsSql2008)
  .Mappings(m => 
   m.FluentMappings
   .AddFromAssemblyOf<MainClass>()
   .Conventions.Add(ForeignKey.EndsWith("Id"))
   )
  .BuildConfiguration()
  .GenerateSchemaCreationScript(new MsSql2008Dialect())
  );


Console.WriteLine("Copy this: \n\n{0}", sx);

..., note the extraneous PersonId :

create table [Contact] 
(
Id INT IDENTITY NOT NULL, 
Number NVARCHAR(255) not null, 
Type NVARCHAR(255) not null, 
OwnerId INT null, 
PersonId INT null, 
primary key (Id)
);

create table [Person] 
(
Id INT IDENTITY NOT NULL, 
PersonName NVARCHAR(255) not null, 
Birthdate DATETIME not null, 
primary key (Id)
);

alter table [Contact] add constraint FK4FF8F4B2F8EC9C3E 
foreign key (OwnerId) references [Person];

alter table [Contact] add constraint FK4FF8F4B2141D8C21 
foreign key (PersonId) references [Person]

While this one with adding of Brownfield conventions...

string sx = string.Join(";" + Environment.NewLine,
        Fluently.Configure()
        .Database(MsSqlConfiguration.MsSql2008)
        .Mappings(m => 
            m.FluentMappings
            .AddFromAssemblyOf<MainClass>()
            .Conventions.AddBrownfieldConventions(referenceSuffix: "Id")                        
            )
        .BuildConfiguration()
        .GenerateSchemaCreationScript(new MsSql2008Dialect()));


Console.WriteLine("Copy this: \n\n{0}", sx);

...yields correct DDL (note that there's no more extraneous PersonId):

create table Contact 
(
Id INT IDENTITY NOT NULL, 
Number NVARCHAR(255) not null, 
Type NVARCHAR(255) not null, 
OwnerId INT null, 
primary key (Id)
);

create table Person 
(
Id INT IDENTITY NOT NULL, 
PersonName NVARCHAR(255) not null, 
Birthdate DATETIME not null, 
primary key (Id)
);

alter table Contact add constraint FK4FF8F4B2F8EC9C3E 
foreign key (OwnerId) references Person

And the correct query(note that it's using OwnerId now, and no more extraneous Person_id):

SELECT 
contacts0_.OwnerId as OwnerId1_, 
contacts0_.Id as Id1_, 
contacts0_.Id as Id3_0_, 
contacts0_.Number as Number3_0_, 
contacts0_.Type as Type3_0_, 
contacts0_.OwnerId as OwnerId3_0_ 
FROM Contact contacts0_ 
WHERE contacts0_.OwnerId=?

Happy Fluenting! :-)

EDIT: 2010-12-18

Program update on flaw found by Stefan Steinegger on multiple references: http://www.ienablemuch.com/2010/12/intelligent-brownfield-mapping-system.html

Postgresql Speeding Up Coalesce

Using this:

create table x as 
select * from generate_series(1,1000) as x(n);

insert into x values(null);

If we perform a query such as this one:

select * from x order by coalesce(n,0)

We will receive Seq Scan query plan:

QUERY PLAN                         
-----------------------------------------------------------
 Sort  (cost=64.90..67.40 rows=1001 width=4)
   Sort Key: (COALESCE(n, 0))
   ->  Seq Scan on x  (cost=0.00..15.01 rows=1001 width=4)
(3 rows)

To optimize that coalesce, we should make an index on coalesce expression:

create index ix_x on x(coalesce(n,0));

And that will produce Index Scan query plan:

QUERY PLAN                            
------------------------------------------------------------------
 Index Scan using ix_x on x  (cost=0.00..55.27 rows=1001 width=4)
(1 row)

But the problem with using sentinel approach on that kind of coalesce, if the sentinel changes, your database won't use the index you've created, so this...

select * from x order by coalesce(n,-1);

...produces Seq Scan query plan again:

QUERY PLAN                         
-----------------------------------------------------------
 Sort  (cost=64.90..67.40 rows=1001 width=4)
   Sort Key: (COALESCE(n, (-1)))
   ->  Seq Scan on x  (cost=0.00..15.01 rows=1001 width=4)
(3 rows)

On that simple example, since Postgresql has a first class support for sorting nulls first, we could just index on that expression directly, let's check first what's the execution plan of the following query without the index...

select * from x order by n nulls first;

...that produces:

QUERY PLAN                         
-----------------------------------------------------------
 Sort  (cost=64.90..67.40 rows=1001 width=4)
   Sort Key: n
   ->  Seq Scan on x  (cost=0.00..15.01 rows=1001 width=4)
(3 rows)

Now let's create an index on x's n...

create index ix_x_y on x(n nulls first);

...then the last query (select * from x order by n nulls first;), will produce:

QUERY PLAN                             
--------------------------------------------------------------------
 Index Scan using ix_x_y on x  (cost=0.00..43.27 rows=1001 width=4)
(1 row)

That's more neat than using COALESCE for sorting nulls

Related: http://www.ienablemuch.com/2010/04/how-to-speed-up-order-by.html

Monday, December 13, 2010

NHibernate 3 + Fluent + Linq = One Stop Shop application development

Do you see yourself years from now, doing repetitive coding, still trying to remember each and every attributes and format of a not-so-compelling XML configuration files? Trying to remember each and every Attributes that needed be attached to your class and its properties? Writing countless simple stored procedures? Neither do I

We are lucky in our time, that the confluence of many programming technologies is enabling us to write verifiable code, a code that can be verified just by reading the er.. code, no more XML, no more magic strings, no need to switch between environments/contexts incessantly, codes that are refactoring-friendly. These technologies also reduces us of mental context-switching

With the introduction of Linq in NHibernate 3, fluent and lambda-based mapping technique(courtesy of Fluent NHibernate), the future is here now. With these technologies, one-stop-shop programming with C# holds more true today than ever before, it's all in the source Luke.

Another one-stop-shop coding in a one-stop-shop coding of FNH is its ConventionBuilder(a topic we will cover on next post), if you have repetitive mappings that follows some patterns/rules and it's a bummer to adhere to them manually on each and every mapping, you can place them in one centralized location, in ConventionBuilders; no more scattered mappings, no more manual typing. And what does Convention-based mappings have that Attributes-based(e.g. Linq to SQL) ones don't have? The latter is not programmable and is scattered on your code.

Following is the simplest code. We omit the repository pattern in our code, yet it maintains its 'real world'-ness, in particular we have prevention mechanism in place for accidentally building a SessionFactory twice. Here's the simplest implementation of NHibernate + Fluent NHibernate + Linq:

using System;
using System.Linq;

using NHibernate;
using NHibernate.Cfg;
using NHibernate.Linq;

/* 
* a bit amused :-) James Gregory has a good program organization,
* he doesn't stash the Fluent NHibernate core functionalities to FluentNHibernate namespace
*/
// using FluentNHibernate; 
using FluentNHibernate.Cfg;
using FluentNHibernate.Cfg.Db;
using FluentNHibernate.Mapping;
using FluentNHibernate.Conventions.Helpers;



namespace SimpleFnhLinq
{
    
    class MainClass
    {
        public static void Main (string[] args)
        {
            var sess = SessionFactoryBuilder.GetSessionFactory().OpenSession();
                        
            
            Action listProduct =  delegate
            {
                foreach(var p in sess.Query<Product>())
                {
                    Console.WriteLine("{0}, {1}, {2}, {3}", p.ProductID, p.Name, p.Category, p.Price);
                }
                    
            };
            

            
            
            Console.WriteLine("\nINSERT: ");
            var nh = new Product { Name = "NHibernate", Description = "Awesome", Category = "ORM", Price = 1234567 };
            sess.Save(nh); // buffer it in memory
            sess.Flush(); // actual database act
            listProduct();
            
            
            Console.WriteLine("\nUPDATE: ");
            nh.Name = "NHibernate+Linq";
            sess.Save(nh);
            sess.Flush();             
            listProduct();
            
            
                    
            Console.WriteLine("\nINSERT: ");
            sess.Save(new Product { Name = "Fluent NHibernate", Description = "Bee's Knees", Category = "Friend", Price = 8901234 } );
            sess.Flush();
            listProduct();
            
            
            
            // NHibernate 3 has Linq support
            var q = from p in sess.Query<Product>()
                    where p.Category == "ORM"
                    select p;
            
            
            Console.WriteLine("\n\nList of ORM: ");
            foreach(var p in q)
            {
                Console.WriteLine("{0}, {1}, {2}, {3}", p.ProductID, p.Name, p.Category, p.Price);
            }

            

            Console.WriteLine("\n\nRecords count: {0}", sess.Query<Product>().Count());
            Console.ReadLine();
        }
    }//MainClass


    #region POCO...

    public class Product 
    { 
        public virtual int ProductID { get; set; }
        public virtual string Name { get; set; }
        public virtual string Description { get; set; }
        public virtual decimal Price { get; set; }
        public virtual string Category { get; set; }
    }

    #endregion ...POCO


    #region Mappings...


    public class ProductMap : ClassMap<Product>
    {
        public ProductMap()
        {        
            Id(x => x.ProductID).GeneratedBy.HiLo("1");
            Map(x => x.Name);
            Map(x => x.Description);
            Map(x => x.Category);
            Map(x => x.Price);            
        }
    }
    
    #endregion ...Mappings



    public static class SessionFactoryBuilder
    {
        static ISessionFactory _sf = null;
        public static ISessionFactory GetSessionFactory()
        {
            // Building SessionFactory is costly, should be done only once, making the backing variable static would prevent creation of multiple factory


            if (_sf != null) return _sf;

            var configurer = new IPersistenceConfigurer[]
                {
                    PostgreSQLConfiguration.Standard
                        .ConnectionString("Server=localhost;Database=store;User ID=postgres;Password=opensesame;"),
                    MsSqlConfiguration.MsSql2008
                        .ConnectionString(@"Data Source=.\SQLExpress;Initial Catalog=store;Trusted_Connection=true")
                };

            var dialects = new NHibernate.Dialect.Dialect[]
            {
                new NHibernate.Dialect.PostgreSQLDialect(),
                new NHibernate.Dialect.MsSql2008Dialect()                
            };

            int n = 1; // Sql Server

            _sf = Fluently.Configure()
                .Database(configurer[n])
                .Mappings
                    (m =>
                        m.FluentMappings.AddFromAssemblyOf<MainClass>()
                        .Conventions.Add(ConventionBuilder.Class.Always(x => x.Table(x.TableName.ToLower())))  // Postgresql Compatibility
                    )
                /*.ExposeConfiguration
                    (x => 
                        { 
                        // If you want don't want to create the script yourself copy the string from here and paste it to your database admin tool
                            string ddl = string.Join("; ", x.GenerateSchemaCreationScript(dialects[n]));
                        Console.WriteLine( "{0}", ddl );
                        Console.ReadLine();
                        }
                    )*/
                .BuildSessionFactory();

            return _sf;
        }
    }//SessionFactoryBuilder
    
                
}//namespace

Here's the DDL, compatible to Postgresql and Sql Server

CREATE TABLE Product
(
  ProductId int NOT NULL primary key,
  Name varchar(100) NOT NULL,
  Description varchar(500) NOT NULL,
  Category varchar(50) NOT NULL,
  Price numeric NOT NULL
);


CREATE TABLE hibernate_unique_key
(
  next_hi integer
);

insert into hibernate_unique_key values(1);

The components needed, add this to your project's References:

fluentnhibernate-NH3.0-binary-1.2.0.694\FluentNHibernate.dll

NHibernate-3.0.0.GA-bin\Required_Bins\NHibernate.dll

NHibernate-3.0.0.GA-bin\Required_For_LazyLoading\Castle\NHibernate.ByteCode.Castle.dll

Saturday, December 11, 2010

Postgresql said, Sql Server(2008) said. Non-Unique Nulls, Unique Nulls

Postgres said:

create table x
(
name varchar(50) null unique
);

insert into x values('John');
insert into x values('Paul');
insert into x values(null); -- will allow this
insert into x values(null); -- will also allow this. Postgres allow duplicate nulls, it only indexes non-nulls. Compliant to ANSI SQL specifcation

How to emulate that in Sql Server 2008?

create table x
(
name varchar(50) null -- don't put unique here
)


create unique index uk_x on x(name) where name is not null -- this index creation, covers non-nulls only

insert into x values('John')
insert into x values('Paul')
insert into x values(null) -- will allow this
insert into x values(null) -- will also allow this

Sql Server 2008 said:

create table x
(
name varchar(50) null unique 
)


insert into x values('John')
insert into x values('Paul')
insert into x values(null)
insert into x values(null) -- this won't be allowed, Sql Server's unique column cover all values(non-nulls and nulls)

How to emulate that in Postgres?

create table x
(
name varchar(50) null unique -- keep this
);

-- Then we resolve all null to single value, e.g. number 1. 
-- essentially preventing duplicate nulls
create unique index uk_x on x (cast(1 as int)) where name is null; 


insert into x values('John');
insert into x values('Paul');
insert into x values(null);
insert into x values(null); -- this won't be allowed

Tuesday, December 7, 2010

Return results from anonymous code block on Postgresql

One of the nice things about MS Sql Server procedural language is that it doesn't support multiple languages, T-SQL only, and as such it can facilitate on-the-fly testing of code logic:

declare @person_id int;

set @person_id = 1;

select * from person where person_id = @person_id;

On Postgres 9.0, though it already supports anonymous code block, it's constrained by the fact that the anonymous code block is encapsulated in an invisible void function(Postgres supports many procedural languages, not just its own flavor of procedural language(PL/pgSQL), e.g. Python, Java, Perl, LOLCODE, etc), well you can think of it that way. Hence displaying query results on anonymous code block won't work out-of-the-box on Postgres. This won't work:

do
$$
declare 
 _person_id int; 
begin
 _person_id = 1;

 select * from person
 where person_id = _person_id;
end;
$$;

That will produce this error:

ERROR:  query has no destination for result data
HINT:  If you want to discard the results of a SELECT, use PERFORM instead.
CONTEXT:  PL/pgSQL function "inline_code_block" line 8 at SQL statement

********** Error **********

ERROR: query has no destination for result data
SQL state: 42601
Hint: If you want to discard the results of a SELECT, use PERFORM instead.
Context: PL/pgSQL function "inline_code_block" line 8 at SQL statement

Neither this will work:

do
$$
declare 
 _person_id int; 
begin
 _person_id = 1;

 return query select * from person
 where person_id = _person_id;
end;
$$;

That code will produce this error:

ERROR:  cannot use RETURN QUERY in a non-SETOF function
LINE 13:  return query select * from person where person_id = _person...
          ^

********** Error **********

ERROR: cannot use RETURN QUERY in a non-SETOF function
SQL state: 42804
Character: 170

So if you want to test logic on-the-fly on Postgres, create temporary table on anonymous code block:

do
$$
declare 
 _person_id int; 
begin
 _person_id = 1;

 drop table if exists _x;
 create temporary table _x as 
 select * from person
 where person_id = _person_id;
end;
$$;

select * from _x; -- the temporary table will continue to exist on individual Postgres session

Wish for Postgres, make this happen, to make adhoc-y stuff happen :-)

do returns table (person_id int, lastname text) as
$$
declare
 _person_id int; 
begin
 _person_id = 1;
 
 return query select person_id, lastname from person
 where person_id = _person_id;
end;
$$;

UPDATE: March 27, 2019

Made the code shorter:

do
$$
declare 
 _person_id int; 
begin
 _person_id = 1;

 create temporary table _x on commit drop
 as 
 select * from person
 where person_id = _person_id;
end;
$$;

select * from _x; -- the temporary table will continue to exist on individual Postgres session

With that code, there's no need to explicitly drop the table. Postgres has an implicit commit whenever queries are executed. on commit drop ensures that the temporary table will not stay in session longer than necessary.

Found out on commit drop here: https://stackoverflow.com/questions/22929365/create-a-temp-table-if-not-exists-for-use-into-a-custom-procedure/22932917#22932917

Friday, December 3, 2010

Unix transport error when unit testing NHibernate for Postgresql under Mono

If you happen to encounter the following error while doing unit testing for NHibernate for Postgresql under Mono...

Internal error
        RemotingException: Unix transport error.

...Change your Npgsql version to a Mono one, then that error won't happen.

That error will appear on unit testing if you are using MS .NET version of Npgsql (example: http://pgfoundry.org/frs/download.php/2868/Npgsql2.0.11-bin-ms.net4.0.zip) under Mono. When unit testing under Mono, you must use Mono version of Npgsql (example: http://pgfoundry.org/frs/download.php/2860/Npgsql2.0.11-bin-mono2.0.zip)

Weird problem, the error only appear when the code is run under Unit Testing(built-in in MonoDevelop). But when run independently, MS .NET version of Npgsql will run just fine under Mono. Anyway, to make matters simple, use Mono version of the component if you are building Mono stuff

Thursday, November 18, 2010

Simple sample for loading parent child records on NHibernate

Object-Relational:

using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using NHibernate.Cfg;
using System.Reflection;
using NHibernate;
using System.Threading;

namespace TestTwoTable
{
    class Program
    {
        static void Main(string[] args)
        {
                                
            if(1 == 1)
            {
                IList<Phone> phones = PhoneList();
    
    
                foreach(Phone p in phones)
                {                                        
                    Console.WriteLine("Phone {0}", p.PhoneNumber);
                    
                    
                    // Try to comment/uncomment Console.WriteLines so you can 
                    // verify that NHibernate indeed lazy loads related parent record,
                    // check your RDBMS log query(s) timestamps to verify that.
                    // 
                    // When using Postgresql on Linux, the log file is in /opt/Postgresql/9.0/data/pg_log
                    // On Mac it is in /Library/PostgreSQL/9.0/data/pg_log
                    // On Windows it is in C:\Program Files (x86)\PostgreSQL\9.0\data\pg_log                                    
                    Console.WriteLine("Owned by {0}", p.Kontact.ContactName);

                    
                    // With lazy loading, the p.Kontact's Phones will not be queried unnecessarily.
                    // The p.Kontact's Phones will only be queried from database when you access it.
                    // Try to comment/uncomment the following code and see that effect in your RDBMS log file.                    .
                    /*
                    foreach(Phone p in p.Kontact.Phones)
                    {
                        Console.WriteLine("Contact's phone {0}", p.PhoneNumber);
                    }*/
                    
                }                                
            }
            
            Console.WriteLine("*****");
            
            if(1 == 1)
            {
                IList<Contact> contacts = ContactList();
    
                foreach(Contact c in contacts)
                {                    
                    Console.WriteLine("\nContact {0}'s Phone(s): ", c.ContactName);                    
                    
                    // With lazy loading, the c's Phones will not be queried unnecessarily.
                    // The c's Phones will only be queried from database when you access it.
                    // Try to comment/uncomment the following code and see that effect in your RDBMS log file                    .
                    foreach(Phone p in c.Phones)
                    {
                        Console.WriteLine("Phone {0}", p.PhoneNumber);
                    }
                    
                    
                }
            }
            
            
        }

        static ISession OpenSession()
        {
            var c = new Configuration();
            c.AddAssembly(Assembly.GetCallingAssembly());
            ISessionFactory f = c.BuildSessionFactory();
            return f.OpenSession();
        }

        static IList<Phone> PhoneList()
        {
            ISession session = OpenSession();
        
            IQuery query = session.CreateQuery("from Phone p where p.Kontact.ContactName = :_contact_name");
            query.SetParameter("_contact_name", "lennon");
            
            IList<Phone> contacts = query.List<Phone>();

            return contacts;

        
        }
        
        static IList<Contact> ContactList()
        {
            ISession session = OpenSession();

            IQuery query = session.CreateQuery("from Contact");
            
            IList<Contact> contacts = query.List<Contact>();
            
            return contacts;
        }
 
    }

    public class Contact
    {
        public virtual int ContactId { get; set; }
        public virtual string ContactName { get; set; }
        
        public virtual IList<Phone> Phones { get; set; }
        

    }

    public class Phone
    {        
        public virtual Contact Kontact { get; set; }
        public virtual int PhoneId { set; get; }
        public virtual string PhoneNumber { get; set; }
    }


}

Mapping:

<?xml version="1.0" encoding="utf-8" ?>

<hibernate-mapping xmlns="urn:nhibernate-mapping-2.2" auto-import="true" assembly="TestTwoTable" namespace="TestTwoTable">

  <class name="Contact" table="contact">
    <id name="ContactId" column="contact_id">
      <generator class="sequence">
        <param name="sequence">contact_contact_id_seq</param>
      </generator>
    </id>

    <property name="ContactName" column="contact_name"/>
    
    <bag name="Phones" inverse="true">
        <key column="contact_id"/>
        <one-to-many class="Phone"/>
    </bag>

  </class>

  <class name="Phone" table="phone">
    <id name="PhoneId" column="phone_id">
      <generator class="sequence">
        <param name="sequence">phone_phone_id_seq</param>
      </generator>
    </id>

    <property name="PhoneNumber" column="phone_number"/>
    
    <many-to-one name="Kontact" column="contact_id" class="Contact" not-null="true"/>
    
  </class>
  
</hibernate-mapping>

The Database:

CREATE TABLE contact
(
  contact_id serial NOT NULL,
  contact_name text NOT NULL,
  CONSTRAINT contact_pkey PRIMARY KEY (contact_id)
);

CREATE TABLE phone
(
  contact_id integer NOT NULL,
  phone_id serial NOT NULL,
  phone_number text NOT NULL,
  CONSTRAINT phone_pkey PRIMARY KEY (phone_id),
  CONSTRAINT fk_phone__contact FOREIGN KEY (contact_id)
      REFERENCES contact (contact_id)
);


insert into contact(contact_name) values('lennon'),('mccartney');
insert into phone(contact_id, phone_number) values(1,'number nine'),(1,'number 10'),(2,'you know my name, look up the number');

Output:

Phone number nine
Owned by lennon
Phone number 10
Owned by lennon
*****

Contact lennon's Phone(s): 
Phone number nine
Phone number 10

Contact mccartney's Phone(s): 
Phone you know my name, look up the number

Monday, November 15, 2010

Restarting Postgresql service under OS X

sudo -u postgres /Library/PostgreSQL/9.0/bin/pg_ctl -D /Library/PostgreSQL/9.0/data restart

Open Source