USENET News article: Re: Recursion (was PROLOG in the "real" world?)

Newsgroup: comp.lang.prolog

Article 4164 of comp.lang.prolog:
Newsgroups:	comp.lang.prolog,comp.software-eng,comp.ai
Path:	icdoc!dds
From:	dds@doc.ic.ac.uk (Diomidis Spinellis)
Subject:	Re: Recursion (was PROLOG in the "real" world?)
Nntp-Posting-Host:	dirty.doc.ic.ac.uk
Message-ID:	<1991Oct30.210210.5377@doc.ic.ac.uk>
Organization:	Dept. of Computing, Imperial College, London, England
Date:	Wed, 30 Oct 1991 21:02:10 GMT
References:	<1991Oct27.164157.6429@cpsc.ucalgary.ca> <770@fudd.dataco.UUCP> <1991Oct29.193258.2656@cs.ubc.ca>
Lines:	166
Content-Length:	5194

In article <1991Oct29.193258.2656@cs.ubc.ca> kean@cs.ubc.ca (Alex Kean) writes:
>
>Since we are at the issue of Prolog being a useful PL, I have the
>following frustrating problem. How fast can Prolog (Any system)
>perform the following computation:
>
>    [Subsumption] Let S be a set of sets. Compute the set SUB(S) to be
>                  the maximal subset of S such that no member E of S
>                  is a subset of member E' of SUB(S), for E =/= E'.
The SB-Prolog system is about an order of magnitude slower than the numbers
you are giving.  This is however the least important thing.

>A naive Prolog program would look like this:
And it is wrong.  Try sub([[1,3],[4,5],[2,5],[1,2,3]],X).

>    sub([], []).
>    sub([First|Tail], NewTail) :- 
>      member(E, Tail),
                 ^^^^ Here you are taking elements from the rest of the
		      set.  You should be checking all the elements of
		      the set.  In order to do this you need to pass
		      the original set around as an extra parameter.
>      subset(E, First),
>      sub(Tail, NewTail).
>    sub([First|Tail], [First|NewTail]) :-
>      sub(Tail, NewTail).

Here is a correct naive version:

sub(X,  Y) :-
	subsume(X, X, Y).

subsume(_, [],  []).

subsume(S, [H | T],  [H | T2]) :-
	not is_subset_of_member(H, S),
	subsume(S, T, T2).

subsume(S, [H | T],  T2) :-
	subsume(S, T, T2).

is_subset_of_member(H, [X | T] ) :-
	H \= X,
	is_subset(X, H).

is_subset_of_member(H, [X | T] ) :-
	is_subset_of_member(H, T).


is_subset([], _ ).

is_subset([H | T], L ) :-
	member(H, L),
	is_subset(T, L).

>To test the efficiency of Prolog, I ran the above program with a list S such
>that S = SUB(S) as follows:
[...]
>Eventhough the complexity is linear with respect to the number of
>iterations, 
It would have been very strange if this was not the case.  You are
attempting to experimentaly evaluate the complexity of the algorithm by
solving the same problem different numbers of times.  This just
evaluates the complexity of the loop predicate that you wrote.  In
order to experimentaly find the complexity of the algorithm you need to
run it with lists of increasing sizes as input.

>In my appliaction, the size of S is averaged
>around 20,000 - 50,000 elements and each element is of size 5 - 10.
>The number of iterations is averaged around 20,000. I guess you can
>see why I am not too pleased with the result.
It looks like you need to look very carefuly at the implementation of
the algorithm.  You probably need to structure your data in such a way,
so that the operations you perform on it (e.g. test for membership)
execute very fast.  After you have implemented the algorithm in the
best possible way, it will help if you profile it in order to find out
where it spends most of the time, and improve those places even more.
It may be that the data structures that are best suited for a good
implementation of the algorithm can not be efficiently represented in
Prolog, but that is not a problem of the Prolog implementation.

>Before I put all the blame on Prolog, I like to admit that I did not
>perform a comparative test on other languages. 
Here is the exactly same algorithm expressed in C.  Notice that the
same data structures are being used:

bool
is_subset_2(struct s_Term *i0, struct s_Term *i1)
{
	if ( i0->type == et_Null) {
		return TRUE;
	}
	if ( i0->type == et_Dot) {
		if ( member_2(i0->t[0], i1) && is_subset_2(i0->t[1], i1)) {
			return TRUE;
		}
	}
	return FALSE;
}

bool
is_subset_of_member_2(struct s_Term *i0, struct s_Term *i1)
{
	if ( i1->type == et_Dot) {
		if ( fct_not_equals_2(i0, i1->t[0]) && is_subset_2(i1->t[0], i0)) {
			return TRUE;
		}
	}
	if ( i1->type == et_Dot) {
		if ( is_subset_of_member_2(i0, i1->t[1])) {
			return TRUE;
		}
	}
	return FALSE;
}

bool
subsume_3(struct s_Term *i0, struct s_Term *i1, struct s_Term **o0)
{
	if ( i1->type == et_Null) {
		*o0 = mkterm(0, et_Null);
		return TRUE;
	}
	if ( i1->type == et_Dot) {
		struct s_Term *v0;
		if ( ! is_subset_of_member_2(i1->t[0], i0) && subsume_3(i0, i1->t[1], &v0)) {
			*o0 = mkterm(2, et_Dot, i1->t[0], v0);
			return TRUE;
		}
	}
	if ( i1->type == et_Dot) {
		struct s_Term *v0;
		if ( subsume_3(i0, i1->t[1], &v0)) {
			*o0 = v0;
			return TRUE;
		}
	}
	return FALSE;
}

bool
sub_2(struct s_Term *i0, struct s_Term **o0)
{
	struct s_Term *v0;
	if ( subsume_3(i0, i0, &v0)) {
		*o0 = v0;
		return TRUE;
	}
	return FALSE;
}

Guess what?  It takes 2 seconds on a Sun 4 to execute loop(1000).  So
the language has nothing to do with your problem.  In fact it seems
that you are using a very fast implementation of Prolog, or a very fast
Sun 4.

>It may well be the case that the problem of
>subsumption requires smarter encoding of sets before any significant
>improvement in speed can be achieved.
Right.

Diomidis
-- 
Diomidis Spinellis                  Internet:                 dds@doc.ic.ac.uk
Department of Computing             UUCP:                    ...!ukc!icdoc!dds
Imperial College, London SW7        #include "/dev/tty"

Newsgroup comp.lang.prolog contents
Newsgroup list
Diomidis Spinellis home page

Unless otherwise expressly stated, all original material on this page created by Diomidis Spinellis is licensed under a Creative Commons Attribution-Share Alike 3.0 Greece License.