Article

Why Triangular Membership Functions Are Successfully Used in F-Transform Applications: A Global Explanation to Supplement the Existing Local Ones

by Olga Kosheleva 1, Vladik Kreinovich 2,* and Thach Ngoc Nguyen 3
1 Department of Teacher Education, University of Texas at El Paso, El Paso, TX 79968, USA
2 Department of Computer Science, University of Texas at El Paso, El Paso, TX 79968, USA
3 Institute of Banking Research and Technology, Banking University of Ho Chi Minh City, 56 Hoang Dieu 2, Thu Duc, Ho Chi Minh City 71010, Vietnam
* Author to whom correspondence should be addressed.
Submission received: 22 April 2019 / Revised: 1 August 2019 / Accepted: 2 August 2019 / Published: 5 August 2019
(This article belongs to the Special Issue Fuzzy Transforms and Their Applications)

Abstract:
The main ideas of the F-transform came from representing expert rules. It would therefore be reasonable to expect that the more accurately the membership functions describe human reasoning, the more successful the corresponding F-transform formulas will be. We know that an adequate description of our reasoning corresponds to complicated membership functions—however, somewhat surprisingly, many successful applications of the F-transform use the simplest possible triangular membership functions. There exist some explanations for this phenomenon, which are based on the local behavior of the signal. In this paper, we supplement these local explanations with a global one: namely, we prove that triangular membership functions are the only ones that provide the exact reconstruction of an appropriate global characteristic of the signal.

1. Formulation of the Problem

1.1. F-Transforms: A Brief Reminder

In many application areas, it has turned out to be very useful to transform the original signal $x(t)$ into values proportional to
$$x_i = \int A\left(\frac{t - t_i}{h}\right)\cdot x(t)\,dt,$$
where $t_i = t_0 + i\cdot h$ for appropriate $t_0$ and $h > 0$, and $A(t)$ is a non-negative function:
  • which is equal to 0 outside the interval $[-1, 1]$,
  • which, starting at $t = -1$, increases to 1 until it reaches $t = 0$,
  • which then decreases to 0, and
  • for which
    $$\sum_i A\left(\frac{t - t_i}{h}\right) = 1$$
    for all $t$; this last property is known as the fuzzy partition property.
This transformation is known as the F-transform; see, e.g., [1,2,3,4,5,6].
This transform comes from the general fuzzy approach (see, e.g., [7,8,9,10,11,12]), namely, from the idea of describing imprecise (fuzzy) expert knowledge of the type “if $t$ is close to $t_i$, then $x(t)$ is close to $x(t_i)$”. From this viewpoint, the function $A(t)$ is a membership function that corresponds to the word “close”.
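To make the above definition concrete, here is a minimal numerical sketch (our own illustration, not taken from [1,2,3,4,5,6]); the helper names and the test signal are ours, and the triangular function $A(t) = 1 - |t|$ is used as the membership function:

```python
import numpy as np

def A_triangular(u):
    """Triangular membership function A(u) = 1 - |u| on [-1, 1], and 0 outside."""
    return np.maximum(0.0, 1.0 - np.abs(u))

def integrate(y, t):
    """Trapezoidal rule on a uniform grid t."""
    dt = t[1] - t[0]
    return dt * (y.sum() - 0.5 * (y[0] + y[-1]))

def f_transform(x, t, t0, h, n_nodes, A=A_triangular):
    """F-transform values x_i = integral of A((t - t_i)/h) * x(t) dt,
    with nodes t_i = t0 + i*h."""
    nodes = t0 + h * np.arange(n_nodes)
    return nodes, np.array([integrate(A((t - ti) / h) * x, t) for ti in nodes])

# A test signal sampled on [0, 5] (and assumed to be 0 outside this interval)
t = np.linspace(0.0, 5.0, 5001)
x = np.sin(2 * t) * np.exp(-0.3 * t)

nodes, xi = f_transform(x, t, t0=0.0, h=1.0, n_nodes=6)

# Fuzzy partition check: the shifted copies of A sum to 1 everywhere on [0, 5]
partition = sum(A_triangular((t - ti) / 1.0) for ti in nodes)
print(np.allclose(partition, 1.0))   # True
print(xi)                            # the F-transform values x_0, ..., x_5
```

The same helper works for any membership function satisfying the above conditions; only the fuzzy partition check depends on the chosen nodes.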

1.2. A Somewhat Unexpected Empirical Fact

Intuitively, one would expect that the closer the function $A(t)$ is to how we actually think, the more successful the results would be. Empirical studies show that rather complex membership functions are needed to represent our reasoning; see, e.g., [11]. However, surprisingly, in many of these applications, good results are obtained when researchers use the very simple triangular membership function $A(t) = 1 - |t|$. Why?
Comment
It is worth mentioning that triangular membership functions are successfully used in other applications of fuzzy techniques; for example:
  • in fuzzy control; see, e.g., an application to control of telerobots in space medicine [13];
  • in information accessing systems such as information retrieval systems, filtering systems, recommender systems, and web quality evaluation tools; see, e.g., [14] and references therein.
Piecewise linear functions—similar to triangular membership functions—are also effectively used in neural networks; see, e.g., [15,16,17].

1.3. It Is Desirable to Have Theoretical Explanations for This Empirical Fact

We strongly believe that every time there is an unexplained empirical fact about data processing algorithms, it is desirable to come up with a theoretical explanation. Such an explanation makes the resulting algorithms more reliable, thus decreasing the possibility that these algorithms will fail and, correspondingly, increasing the chances that these efficient algorithms will be used by practitioners, even in potentially high-risk situations. Sometimes, the corresponding analysis finds conditions under which these methods work successfully, and even helps develop more efficient techniques.
Moreover, since each theoretical justification increases our confidence in the corresponding method, the more different explanations we have, the higher our confidence. Thus, it is better to have as many different theoretical explanations as possible.
From this viewpoint, it is desirable to have as many theoretical explanations as possible for the above empirical fact—that triangular membership functions, in spite of being different from what we elicit from experts, have been successfully used in many applications of F-transforms (and in other applications of fuzzy techniques).

1.4. How This Empirical Fact Is Explained Now

At present, there are several theoretical explanations for the success of triangular membership functions.
In [18,19,20], it is shown that these membership functions are the most robust—in the sense that a given change from the input $x$ to a nearby input $x + \Delta x$ leads to the smallest possible change in the value $A(x)$ of the corresponding membership function; specifically:
  • In [20], we describe this requirement in crisp terms—as minimizing the difference between the values $A(x)$ and $A(x + \Delta x)$.
  • In contrast, in [18,19], we consider this requirement in fuzzy terms—as the requirement that the degree of closeness between $A(x)$ and $A(x + \Delta x)$ should be the largest possible. In [18], this is done with type-1 fuzzy techniques, and in [19], with type-2 fuzzy techniques.
In [21], an alternative explanation was proposed: namely, it was shown that the triangular membership functions are the least vulnerable to noise.
It should be mentioned that these explanations are “local” in the sense that they consider:
  • either values A ( x ) at nearby points x
  • or the effect of noise—which is also added locally, for each x separately.

1.5. What We Do in This Paper

In this paper, we provide an alternative “global” explanation for the successful use of triangular membership functions: namely, we show that the triangular membership functions are the only ones whose use enables us to correctly reconstruct an appropriate global characteristic. For this purpose:
  • first, we consider the selection of an appropriate global characteristic, solve the resulting optimization problem, and thus find the global characteristic that is optimal in some reasonable sense;
  • second, we prove that the triangular membership functions are the only ones that allow us to uniquely reconstruct the optimal global characteristic of the original signal.
These two results are new. The second result is the main contribution of this paper. The first result is auxiliary for this paper, but it may be useful for other applications.

1.6. The Structure of the Paper

We want to find all membership functions that allow us to exactly reconstruct the most adequate global characteristic. This idea is described in Section 2. To find the resulting membership functions, we first describe which characteristics to consider and which characteristics are the most adequate. In Section 3, we provide a general description of what we mean by a global characteristic. In Section 4, we use a general description of optimization problems to provide a natural formalization for what it means for characteristics to be more (or less) adequate than others.
The resulting optimization problem is then solved in Section 5, where we describe the global characteristics which are optimal with respect to the corresponding optimality criteria. Finally, Section 6 contains the main result of this paper: that only triangular membership functions enable us to uniquely reconstruct the most adequate global characteristic of the original signal from its F-transform.

2. Local vs. Global Characteristics: Main Idea

2.1. What We Mean by Local and Global Characteristics

No measuring instrument can provide an instantaneous value of a physical quantity. No matter at what time t we perform our measurement, the measurement result depends not only on the value of the signal x ( t ) at this moment of time, but also on the values x ( s ) at nearby moments of time.
In some cases, we are interested in the local behavior of the signal. In this case, we try to measure values which are as close to x ( t ) as possible. F-transform values are an example of such a local analysis.
In other cases, we are interested in the global trend. In such cases, instead of concentrating on a short-term time interval, we deliberately measure the signal over a long period of time.

2.2. Resulting Idea

To most adequately reconstruct the signal, we should be able to adequately reproduce both its local and its global characteristics. By definition, F-transform adequately represents the local characteristics, no matter what membership function A ( t ) we select. Thus, it is reasonable to select a membership function which most adequately represents global characteristics.
Let us describe this idea in precise terms.

3. Which Global Characteristics Should We Represent: Discussion

3.1. Need for Linearization

Signals are usually weak. Thus, for any quantity $q$ that depends on the signal $x(t)$—be it local or global—we can ignore terms which are quadratic or of higher order in $x(t)$ and thus retain only the linear terms in the corresponding dependence. As a result, we should only consider linear quantities, i.e., quantities of the type
$$q = \int q(t)\cdot x(t)\,dt.$$

3.2. Which Linear Quantities Should We Select?

Of course, when we perform the F-transform, we lose some information about the signal. Indeed, on each time interval, we replace infinitely many values $x(t)$, corresponding to infinitely many moments of time $t$ from this interval, with finitely many values of the corresponding F-transform. Thus, we cannot perfectly reconstruct all possible global characteristics $q$: from the values of all these characteristics (e.g., of the integrals $\int^t x(s)\,ds$, whose derivative with respect to $t$ is $x(t)$), we would be able to uniquely reconstruct all the values $x(t)$.
Thus, we need to select the most appropriate global characteristics.

3.3. How to Define What Is Most Appropriate?

In different situations, different global characteristics may be more appropriate. In this paper, instead of trying to list specific notions of appropriateness, we will consider all possible optimality criteria of this type.
Interestingly, it turns out that all reasonable optimality criteria of this type lead, in effect, to the same family of optimal global characteristics—and the only way to reconstruct these characteristics exactly is to use triangular membership functions.
Let us describe all this in precise terms.

4. Selecting the Most Adequate Global Characteristic: Towards Precise Formulation of the Problem

4.1. Towards Describing What Is More Appropriate and What Is Less Appropriate

As we have mentioned, all global characteristics have the form $q = \int q(t)\cdot x(t)\,dt$. Thus, selecting a characteristic is equivalent to selecting the corresponding function $q(t)$.
This function $q(t)$ may be discontinuous, as in the above example of the characteristic $\int^t x(s)\,ds$. However, at least it should be measurable (non-measurable functions cannot be defined without using the Axiom of Choice, which means that they are not definable).
Of course, if we can reconstruct the value $\int q(t)\cdot x(t)\,dt$, then, for every real value $c$, we can also reconstruct the related value $\int (c\cdot q(t))\cdot x(t)\,dt$, since this related value is simply equal to
$$c\cdot\int q(t)\cdot x(t)\,dt.$$
Thus, strictly speaking, a characteristic is represented not by a single function, but by the entire family $\{c\cdot q(t)\}_{c\neq 0}$ of related functions. Thus, we arrive at the following definition.
Definition 1.
By a characteristic or, alternatively, a family, we mean a family of the type $\{c\cdot q(t)\}_{c\neq 0}$, where $q(t)$ is a given measurable function, and $c$ runs over all possible non-zero real numbers.

4.2. Discussion

To describe which families are most appropriate—i.e., to formulate the corresponding optimization problem—we need to describe, in precise terms, what it means that some characteristic (family) is more appropriate than the other one.
To come up with such a formalization, let us recall how optimization problems are (and can be) described in general. In the general case, we have a set $\mathcal{A}$ of alternatives, and for alternatives from this set, we have defined:
  • what it means for the alternative $a$ to be better than the alternative $b$ (we will denote this by $a \prec b$), and
  • what it means for the alternatives $a$ and $b$ to be of the same quality (we will denote this by $a \sim b$).
In these general terms, an alternative $a_{\rm opt}$ is optimal if it is better than or equivalent to any other alternative, i.e., if for every alternative $a \in \mathcal{A}$, we have either $a_{\rm opt} \prec a$ or $a_{\rm opt} \sim a$.
In the traditional formulation of an optimization problem, we have an objective function f ( a ) defined on the set A of all possible alternatives. We want to minimize the value of this objective function. (Sometimes, we want to maximize it; this case can be treated similarly.) In this case, the above relations a b and a b can be naturally defined in terms of this objective function, namely:
  • we have $a \prec b$ if and only if $f(a) < f(b)$; and
  • we have $a \sim b$ if and only if $f(a) = f(b)$.
In this formulation, an alternative $a_{\rm opt}$ is optimal if it attains the smallest possible value of the objective function, i.e., if $f(a_{\rm opt}) \leq f(a)$ for all $a \in \mathcal{A}$.
In some cases, there are several different alternatives which are all optimal in the sense of the given criterion. For example, if the set $\mathcal{A}$ contains algorithms for solving a certain class of problems, and $f(a)$ is the average approximation error of the solution, we may have several different methods that guarantee the same smallest possible average approximation error.
In such cases, we can use this uncertainty to optimize some other objective function g ( a ) . For example, in the algorithms case, we can select, among all the algorithms with the smallest possible average approximation error f ( a ) , the one for which, e.g., the worst-case approximation error is the smallest possible.
In such situations, the alternative a is better than the alternative b if:
  • either a is better than b with respect to the original optimality criterion, i.e.,
    f ( a ) < f ( b ) ,
  • or with respect to the original optimality criterion, the alternatives a and b are of equal quality, but the alternative a is better with respect to the second objective function, i.e., if
    f ( a ) = f ( b )   and   g ( a ) < g ( b ) .
In other words, in such situations, we can have a more complex way of describing when an alternative a is better than the alternative b:
  • we have $a \prec b$ if and only if
    $f(a) < f(b)$ or ($f(a) = f(b)$ and $g(a) < g(b)$);
    and
  • we have $a \sim b$ if and only if
    $f(a) = f(b)$ and $g(a) = g(b)$.
Even with this modified optimality criterion, we can still have several equally optimal alternatives. For example, we may have several different algorithms which have the same smallest possible average approximation error and the same smallest possible worst-case approximation error. In such cases, we can use this non-uniqueness to optimize yet another objective function h ( a ) —e.g., the average computation time, etc.
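As a small illustration of this lexicographic construction (our own sketch; the objective functions are hypothetical placeholders), the relations "better than" and "of the same quality" obtained from objectives $f$, $g$, $h$, … can be implemented by comparing tuples of objective values:

```python
from typing import Any, Callable, Sequence

def make_lexicographic_criterion(objectives: Sequence[Callable[[Any], float]]):
    """Build the relations 'better than' and 'of the same quality' obtained by
    minimizing the objectives f, g, h, ... lexicographically."""
    def key(a):
        return tuple(obj(a) for obj in objectives)

    def better(a, b):            # a is better than b
        return key(a) < key(b)   # Python tuple comparison is lexicographic

    def equivalent(a, b):        # a and b are of the same quality
        return key(a) == key(b)

    return better, equivalent

# Hypothetical objectives: average error f, worst-case error g, computation time h
better, equivalent = make_lexicographic_criterion([
    lambda alg: alg["avg_error"],
    lambda alg: alg["worst_error"],
    lambda alg: alg["time"],
])
a = {"avg_error": 0.1, "worst_error": 0.5, "time": 2.0}
b = {"avg_error": 0.1, "worst_error": 0.7, "time": 1.0}
print(better(a, b))       # True: same average error, smaller worst-case error
print(equivalent(a, b))   # False
```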
How can we describe such general optimization settings in precise terms? In all these cases, what is important is which alternatives are better than others and which are of the same quality. In other words, we need some criterion according to which, for every two alternatives a and b, we can say one of the three things:
  • we can say that $a$ is better than $b$ in the sense of this criterion; we will denote this by
    $a \prec b$;
  • we can say that $b$ is better than $a$ in the sense of the given criterion; we will denote this by
    $b \prec a$;
  • or we can say that the two alternatives are equally good with respect to the given criterion; we will denote this by $a \sim b$.
No matter what the optimality criterion is, we have these relations. Thus, we can simply make these relations the definition of an optimality criterion.
Of course, we need to make sure that these relations are consistent: e.g., if a is better than b and b is better than c, then a should be better than c. Thus, we arrive at the following definition.
Definition 2 
(see, e.g., [22]). Let $\mathcal{A}$ be a set; its elements will be called alternatives. By an optimality criterion, we mean a pair of relations $\langle\prec, \sim\rangle$ on the set $\mathcal{A}$ that satisfies the following properties:
  • for every two alternatives $a$ and $b$, we have one and only one of three options: $a \prec b$, $b \prec a$, and $a \sim b$;
  • if $a \prec b$ and $b \prec c$, then $a \prec c$;
  • if $a \prec b$ and $b \sim c$, then $a \prec c$;
  • if $a \sim b$ and $b \prec c$, then $a \prec c$;
  • if $a \sim b$ and $b \sim c$, then $a \sim c$;
  • $a \sim a$; and
  • if $a \sim b$, then $b \sim a$.
In these terms, an alternative is optimal if it is better than or of the same quality as any other alternative:
Definition 3.
We say that a characteristic $a_{\rm opt}$ is optimal with respect to the optimality criterion $\langle\prec, \sim\rangle$ if, for every alternative $a \in \mathcal{A}$, we have $a_{\rm opt} \prec a$ or $a_{\rm opt} \sim a$.

4.3. Discussion

From the purely mathematical viewpoint, these conditions may be all we need; however, as we will show, from the practical viewpoint, we need to impose some additional requirement on the corresponding pair of relations.
Indeed, the whole purpose of selecting an optimality criterion is to use this optimality criterion for selecting the best alternative, i.e., the alternative which is better—according to this criterion— than any other alternative.
Thus, if for some pair of relations $\langle\prec, \sim\rangle$ there is no such optimal alternative, the corresponding “optimality criterion” is, from the practical viewpoint, completely useless. Thus, we need to require that there should be at least one optimal alternative.
What if there are several characteristics which are all the most appropriate according to the given criterion? In this case, as we have mentioned earlier, we can use this non-uniqueness to optimize something else. For example, if several characteristics are equally good in terms of accuracy with which we can predict the future behavior of the signal, then we can select among them the characteristic which is the easiest to compute. As a result, we get, in effect, a new optimality criterion, according to which a is better than b if:
  • either a is better than b according to the original optimality criterion,
  • or a is equivalent to b in terms of the original optimality criterion but better according to the additional optimality criterion.
If, for the new optimality criterion, we still have several different optimal alternatives, we can then optimize something else, etc., until we reach a final optimality criterion for which there is exactly one optimal alternative.
Definition 4.
We say that an optimality criterion is final if there exists exactly one alternative which is optimal with respect to this criterion.

4.4. Need for Scale-Invariance

Let us apply the above general formulation to the problem of selecting the most adequate global characteristic—or, to be more precise, to the problem of selecting the most adequate family $F = \{c\cdot q(t)\}_{c\neq 0}$ of such characteristics.
In our problem, we deal with signals x ( t ) that describe how the value of a physical quantity x depends on time. We may have a starting point for the corresponding process, which provides a natural starting point for measuring time, but, in general, the numerical value of time depends on what unit we use for measuring time. We can use seconds or minutes or hours—the time interval will be the same, but the numerical values will change.
When we replace the original unit for measuring time with a new unit which is $\lambda$ times smaller, all numerical values of time are re-scaled, i.e., multiplied by $\lambda$. For example, if we go from seconds to milliseconds, all numerical values are multiplied by 1000. The function $q(t)$ in the new units becomes $q(\lambda\cdot t)$.
It is reasonable to require that the relative quality of different characteristics should not change if we simply change the unit used for measuring time, without changing anything of substance. In other words, it is reasonable to require that the criterion be “scale-invariant”. Here is a precise definition.
Definition 5.
We say that an optimality criterion $\langle\prec, \sim\rangle$ on the set of all possible families (in the sense of Definition 1) is scale-invariant if, for every two functions $q(t)$ and $r(t)$ and for every $\lambda > 0$, the following two conditions hold:
  • if $\{c\cdot q(t)\}_c \prec \{c\cdot r(t)\}_c$, then $\{c\cdot q(\lambda\cdot t)\}_c \prec \{c\cdot r(\lambda\cdot t)\}_c$;
  • if $\{c\cdot q(t)\}_c \sim \{c\cdot r(t)\}_c$, then $\{c\cdot q(\lambda\cdot t)\}_c \sim \{c\cdot r(\lambda\cdot t)\}_c$.

5. Which Characteristics Are the Most Adequate: Auxiliary Result

Discussion

In the previous section, we argued that the most adequate global characteristic must be optimal with respect to some final scale-invariant optimality criterion. Let us describe all such characteristics.
Proposition 1.
For every final scale-invariant optimality criterion, each optimal characteristic has the form $\{c\cdot t^{\beta}\}_c$ for some real value $\beta$.
Comment
Note that we did not select any specific optimality criterion. We could select an optimality criterion and find out which characteristic is better for this particular criterion. However, there are many possible optimality criteria. Thus, if we restrict ourselves to single optimality criteria, we would need to repeatedly solve the corresponding optimization problem for all these different optimality criteria.
Instead, Proposition 1 provides a general result that covers all (reasonable) optimality criteria. According to this result, no matter what optimality criterion we choose, the optimal characteristic always has the form $c\cdot t^{\beta}$. For different optimality criteria, we may have different values of $\beta$, but the form is always the same.
Proof of Proposition 1.
Let us denote the scaling transformation that transforms a family $F = \{c\cdot q(t)\}_c$ into the re-scaled family $\{c\cdot q(\lambda\cdot t)\}_c$ by $T_\lambda$. In terms of this notation, scale-invariance means that:
  • if $F \prec G$, then $T_\lambda(F) \prec T_\lambda(G)$; and
  • if $F \sim G$, then $T_\lambda(F) \sim T_\lambda(G)$.
Let $\langle\prec, \sim\rangle$ be the final scale-invariant criterion. Since this criterion is final, there exists exactly one optimal characteristic $F_{\rm opt}$. Let us prove that this characteristic is scale-invariant, i.e., that $T_\lambda(F_{\rm opt}) = F_{\rm opt}$ for all $\lambda > 0$. (This proof is similar to the one given in [22].)
Indeed, since $F_{\rm opt}$ is optimal, it is better than or equivalent to any other characteristic. In particular, for every $G$, the characteristic $F_{\rm opt}$ is better than or equivalent to $T_{1/\lambda}(G)$: $F_{\rm opt} \prec T_{1/\lambda}(G)$ or $F_{\rm opt} \sim T_{1/\lambda}(G)$. By applying scale-invariance, we conclude that either $T_\lambda(F_{\rm opt}) \prec T_\lambda(T_{1/\lambda}(G))$ or $T_\lambda(F_{\rm opt}) \sim T_\lambda(T_{1/\lambda}(G))$. However, one can easily check that $T_\lambda(T_{1/\lambda}(G)) = G$.
Thus, for every characteristic $G$, we have either $T_\lambda(F_{\rm opt}) \prec G$ or $T_\lambda(F_{\rm opt}) \sim G$. By the definition of an optimal characteristic, this means that the characteristic $T_\lambda(F_{\rm opt})$ is optimal. However, for a final criterion, there is only one optimal characteristic, so we conclude that $T_\lambda(F_{\rm opt}) = F_{\rm opt}$. Thus, the optimal characteristic is indeed scale-invariant.
By definition, each characteristic has the form $\{c\cdot q(t)\}_c$. Let us denote the function $q(t)$ corresponding to the optimal characteristic by $q_{\rm opt}(t)$. The fact that the optimal family is scale-invariant means, in particular, that for every $\lambda > 0$, the function $q_{\rm opt}(\lambda\cdot t)$—which belongs to the re-scaled family $T_\lambda(F_{\rm opt})$—also belongs to the original family, i.e., has the form $c(\lambda)\cdot q_{\rm opt}(t)$ for some value $c(\lambda)$: $q_{\rm opt}(\lambda\cdot t) = c(\lambda)\cdot q_{\rm opt}(t)$. It is known that the only measurable functions satisfying this functional equation are functions of the form $C\cdot t^{\beta}$; see, e.g., [23].
The proposition is proven. □
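Comment
For completeness, here is a short sketch of why the functional equation $q_{\rm opt}(\lambda\cdot t) = c(\lambda)\cdot q_{\rm opt}(t)$ forces a power law in the smooth case; the general measurable case is covered in [23]. If $q_{\rm opt}(t)$ and $c(\lambda)$ are differentiable and $q_{\rm opt}(t) > 0$ for $t > 0$, then differentiating both sides with respect to $\lambda$ and setting $\lambda = 1$ gives $t\cdot q_{\rm opt}'(t) = \beta\cdot q_{\rm opt}(t)$, where $\beta \stackrel{\rm def}{=} c'(1)$. For $t > 0$, this means $(\ln q_{\rm opt}(t))' = \beta/t$, hence $\ln q_{\rm opt}(t) = \beta\cdot\ln t + {\rm const}$ and $q_{\rm opt}(t) = C\cdot t^{\beta}$. Conversely, every such function satisfies the original equation, with $c(\lambda) = \lambda^{\beta}$.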

6. Main Result: A New Justification of Triangular Membership Functions

Now that we know which global characteristics are the most adequate, let us find out which membership functions allow us to uniquely reconstruct the most adequate characteristic of a signal from its F-transform. We will start with a precise definition.
Definition 6.
We say that, for a membership function $A(t)$, it is possible to always reconstruct a global characteristic $\int q(t)\cdot x(t)\,dt$ if, for every $t_0$ and $h$, the value of this characteristic can be uniquely determined once we know all the values
$$x_i = \int A\left(\frac{t - t_i}{h}\right)\cdot x(t)\,dt,$$
where $t_i \stackrel{\rm def}{=} t_0 + i\cdot h$.

6.1. Case of β = 0

A particular case of the most adequate global characteristic is the case $\beta = 0$, when $q(t) = {\rm const}$ and the corresponding global characteristic is simply the integral $\int x(t)\,dt$. This characteristic can always be reconstructed from the F-transform, since we require that $\sum_i A\left(\frac{t - t_i}{h}\right) = 1$ for all $t$, and thus
$$\int x(t)\,dt = \sum_i \int A\left(\frac{t - t_i}{h}\right)\cdot x(t)\,dt = \sum_i x_i.$$
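As a quick sanity check (our own numerical sketch, with a hypothetical test signal), the sum of the F-transform values indeed reproduces $\int x(t)\,dt$:

```python
import numpy as np

A = lambda u: np.maximum(0.0, 1.0 - np.abs(u))   # triangular membership function
t = np.linspace(0.0, 5.0, 5001)                  # uniform time grid
dt = t[1] - t[0]
x = np.sin(2 * t) * np.exp(-0.3 * t)             # a test signal supported on [0, 5]
nodes = np.arange(6)                             # t_i = 0, 1, ..., 5  (t_0 = 0, h = 1)
xi = np.array([np.sum(A(t - ti) * x) * dt for ti in nodes])

print(np.sum(x) * dt)    # the integral of x(t) dt
print(xi.sum())          # the sum of the F-transform values -- matches up to quadrature error
```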

6.2. General Case

Thus, we should worry only about the case when $\beta \neq 0$. In this case, we have the following result.
Proposition 2.
The only membership function $A(t)$ for which it is possible to always reconstruct a most adequate global characteristic with $\beta \neq 0$ is the triangular membership function—it enables the reconstruction of the characteristic $\int t\cdot x(t)\,dt$ corresponding to $\beta = 1$.
Comment
This result provides the desired global explanation of why triangular membership functions work so well in F-transform applications.
Proof of Proposition 2.
Let us assume that, for some $\beta \neq 0$, the membership function $A(t)$ enables us to always uniquely reconstruct the corresponding characteristic
$$\int t^{\beta}\cdot x(t)\,dt.$$
Let us first consider the case when $t_0 = 0$, $h = 1$, and the signal $x(t)$ is equal to 0 everywhere except for the interval $[0, 1]$. Then, only two F-transform values are different from 0:
  • the value $x_0 = \int_0^1 A(t)\cdot x(t)\,dt$, and
  • the value $x_1 = \int_0^1 A(t - 1)\cdot x(t)\,dt$.
The fuzzy partition requirement implies that $A(t) + A(t - 1) = 1$ for $t \in [0, 1]$, so
$$A(t - 1) = 1 - A(t).$$
The only way to be able to always reconstruct the value $\int_0^1 t^{\beta}\cdot x(t)\,dt$ from these two values, no matter how the signal $x(t)$ behaves on the interval $[0, 1]$, is to have $t^{\beta}$ equal to a linear combination of $A(t)$ and $A(t - 1) = 1 - A(t)$. Thus, the function $t^{\beta}$ is a linear combination of $A(t)$ and 1, and, hence, $A(t)$ is a linear combination of $t^{\beta}$ and 1, i.e., $A(t) = a + b\cdot t^{\beta}$.
For $t = 1$, we must have $A(t) = 0$, so $a + b = 0$ and thus $A(t) = a\cdot(1 - t^{\beta})$. For $t = 0$, we must have $A(0) = 1$, so $a = 1$ and $A(t) = 1 - t^{\beta}$ for $t \in [0, 1]$. Correspondingly, for $s \in [-1, 0]$, due to $A(t - 1) = 1 - A(t)$, we have
$$A(s) = 1 - A(s + 1) = (s + 1)^{\beta}.$$
Let us now consider a signal which is different from 0 only on the interval $[1, 2]$. For this signal, the desired global characteristic has the form $\int_1^2 t^{\beta}\cdot x(t)\,dt$, and the only non-zero values of the F-transform are $x_1 = \int_1^2 (1 - (t - 1)^{\beta})\cdot x(t)\,dt$ and $x_2 = \int_1^2 (t - 1)^{\beta}\cdot x(t)\,dt$. Thus, the only way to exactly reconstruct the global characteristic is to have $t^{\beta}$ be a linear combination of $1 - (t - 1)^{\beta}$ and $(t - 1)^{\beta}$, i.e., a linear combination of $(t - 1)^{\beta}$ and 1: $t^{\beta} = a\cdot(t - 1)^{\beta} + b$.
Let us show that $\beta = 1$. For this, we need to show that the cases $\beta > 1$ and $\beta < 1$ are impossible.
Indeed, differentiating both sides with respect to $t$, we get $\beta\cdot t^{\beta - 1} = a\cdot\beta\cdot(t - 1)^{\beta - 1}$. If $\beta > 1$, then for $t = 1$, the right-hand side is 0, so we get $\beta = 0$, which contradicts the assumption that $\beta > 1$. If $\beta < 1$, then as $t \to 1$, the right-hand side tends to infinity while the left-hand side tends to $\beta$—also a contradiction.
Thus, $\beta = 1$, so $A(t) = 1 - |t|$, i.e., we indeed have a triangular membership function. The proposition is proven. □
Comment
Once we have a triangular membership function, it is easy to combine the F-transform values to get an integral of a linear function. For simplicity, assume that we start with a signal which is 0 for $t < 0$, and that $h = 1$. Then, the values $x(t)$ corresponding to $t \in [0, 1]$ affect the value $x_0$ with weight $1 - t$, and the value $x_1$ with weight $t$. If we take the difference $x_1 - x_0$, this difference corresponds to the weight $2t - 1$ on $[0, 1]$ (and the weight $2 - t$ on $[1, 2]$).
We can normalize the difference $x_1 - x_0$ so that the coefficient at $t$ on $[0, 1]$ becomes equal to 1. For the resulting normalized linear combination $\frac{1}{2}\cdot(x_1 - x_0)$, on $[0, 1]$, we have the weight $t - \frac{1}{2}$, and on $[1, 2]$, the weight $1 - \frac{t}{2}$.
On the interval $[1, 2]$, the next F-transform value $x_2$ corresponds to the coefficient $t - 1$ (and to 0 before that). Thus, by adding $x_2$ with an appropriate coefficient, we can make sure that the linear combination continues to have $t$ with coefficient 1 on the interval $[1, 2]$ as well. For that, we need to add $x_2$ with coefficient $\frac{3}{2}$. Then, the resulting linear combination $\frac{1}{2}\cdot(x_1 - x_0) + \frac{3}{2}\cdot x_2$ is equal to $t - \frac{1}{2}$ on the whole interval $[0, 2]$.
On $[2, 3]$, this combination is equal to $\frac{3}{2}\cdot(3 - t)$. Thus, to make sure that we get a linear combination which is equal to $t - \frac{1}{2}$ on the interval $[2, 3]$ as well, we need to add $x_3$ with coefficient $\frac{5}{2}$, etc. At the end, when we reach the end of the time interval on which the signal is defined, the corresponding linear combination gives us the integral
$$\int\left(t - \frac{1}{2}\right)\cdot x(t)\,dt = \int t\cdot x(t)\,dt - \frac{1}{2}\cdot\int x(t)\,dt.$$
Since, as we have mentioned, we can easily determine the integral $\int x(t)\,dt$ by adding all the values of the F-transform, we can thus indeed determine the value of the desired global characteristic $\int t\cdot x(t)\,dt$.
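The construction described in this comment can also be checked numerically; the following sketch (ours, with a hypothetical test signal) forms the combination with coefficients $-\frac{1}{2}, \frac{1}{2}, \frac{3}{2}, \frac{5}{2}, \ldots$ of $x_0, x_1, x_2, x_3, \ldots$, adds $\frac{1}{2}\cdot\sum_i x_i$, and recovers $\int t\cdot x(t)\,dt$:

```python
import numpy as np

A = lambda u: np.maximum(0.0, 1.0 - np.abs(u))   # triangular membership function
t = np.linspace(0.0, 5.0, 5001)
dt = t[1] - t[0]
x = np.sin(2 * t) * np.exp(-0.3 * t)             # a test signal supported on [0, 5]
nodes = np.arange(6)                             # t_i = 0, 1, ..., 5  (t_0 = 0, h = 1)
xi = np.array([np.sum(A(t - ti) * x) * dt for ti in nodes])

coeffs = nodes - 0.5                             # -1/2, 1/2, 3/2, 5/2, ... as above
half_shifted = np.dot(coeffs, xi)                # approx. integral of (t - 1/2) * x(t) dt
first_moment = half_shifted + 0.5 * xi.sum()     # add (1/2) * integral of x(t) dt

print(np.sum(t * x) * dt)   # the global characteristic: integral of t * x(t) dt
print(first_moment)         # the same value, reconstructed from the F-transform values alone
```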

Author Contributions

The authors contributed equally to this work. All three authors participated in formulating the problem in precise terms (the effort led by T.N.N.), proving the results (the effort led by O.K.), and editing the resulting paper (the effort led by V.K.).

Funding

This work was supported by the Banking University of Ho Chi Minh City, Vietnam, and by the US National Science Foundation via grants 1623190 (A Model of Change for Preparing a New Generation for Professional Practice in Computer Science) and HRD-1242122 (Cyber-ShARE Center of Excellence).

Acknowledgments

The authors are thankful to Irina Perfilieva for encouragement and valuable discussions, and to the anonymous referees for very useful suggestions.

Conflicts of Interest

The authors declare no conflict of interest.

References

  1. Novak, V.; Perfilieva, I.; Holcapek, M.; Kreinovich, V. Filtering out high frequencies in time series using F-transform. Inf. Sci. 2014, 274, 192–209.
  2. Novak, V.; Perfilieva, I.; Kreinovich, V. F-transform in the analysis of periodic signals. In Proceedings of the 15th Czech-Japan Seminar on Data Analysis and Decision Making under Uncertainty CJS'2012, Osaka, Japan, 24–27 September 2012.
  3. Perfilieva, I. Fuzzy transforms: Theory and applications. Fuzzy Sets Syst. 2006, 157, 993–1023.
  4. Perfilieva, I. F-transform. In Springer Handbook of Computational Intelligence; Springer: Berlin/Heidelberg, Germany, 2015; pp. 113–130.
  5. Perfilieva, I.; Danková, M.; Bede, B. Towards a higher degree F-transform. Fuzzy Sets Syst. 2011, 180, 3–19.
  6. Perfilieva, I.; Kreinovich, V.; Novak, V. F-transform in view of trend extraction. In Proceedings of the 15th Czech-Japan Seminar on Data Analysis and Decision Making under Uncertainty CJS'2012, Osaka, Japan, 24–27 September 2012.
  7. Belohlavek, R.; Dauben, J.W.; Klir, G.J. Fuzzy Logic and Mathematics: A Historical Perspective; Oxford University Press: New York, NY, USA, 2017.
  8. Klir, G.; Yuan, B. Fuzzy Sets and Fuzzy Logic; Prentice Hall: Upper Saddle River, NJ, USA, 1995.
  9. Mendel, J.M. Uncertain Rule-Based Fuzzy Systems: Introduction and New Directions; Springer: Cham, Switzerland, 2017.
  10. Nguyen, H.T.; Walker, C.; Walker, E.A. A First Course in Fuzzy Logic; Chapman and Hall/CRC: Boca Raton, FL, USA, 2019.
  11. Novák, V.; Perfilieva, I.; Močkoř, J. Mathematical Principles of Fuzzy Logic; Kluwer: Dordrecht, The Netherlands, 1999.
  12. Zadeh, L.A. Fuzzy sets. Inf. Control 1965, 8, 338–353.
  13. Haidegger, T.; Kovács, L.; Precup, R.-E.; Benyó, B.; Benyó, Z.; Preitl, S. Simulation and control in telerobots in space medicine. Acta Astronaut. 2012, 81, 390–402.
  14. Herrera-Viedma, E.; López-Herrera, A.G. A review on information accessing systems based on fuzzy linguistic modeling. Int. J. Comput. Intell. Syst. 2010, 3, 420–437.
  15. Goodfellow, I.; Bengio, Y.; Courville, A. Deep Learning; MIT Press: Cambridge, MA, USA, 2016.
  16. Ruiz-Rangel, J.; Ardila Hernandez, C.J.; Maradei Gonzalez, L.; Jabba Molinares, D. ERNEAD: Training of artificial neural networks based on a genetic algorithm and finite automata theory. Int. J. Artif. Intell. 2018, 16, 214–253.
  17. Saadat, J.; Moallem, P.; Koofigar, H. Training echo state neural network using harmony search algorithm. Int. J. Artif. Intell. 2017, 15, 163–179.
  18. Gholamy, A.; Kosheleva, O.; Kreinovich, V. How to explain the efficiency of triangular and trapezoid membership functions in applications to design. Ontol. Des. 2019, 9, 253–260.
  19. Kosheleva, O.; Kreinovich, V.; Shahbazova, S.N. Type-2 fuzzy analysis explains ubiquity of triangular and trapezoid membership functions. In Proceedings of the World Conference on Soft Computing, Baku, Azerbaijan, 29–31 May 2018.
  20. Kreinovich, V.; Kosheleva, O.; Shahbazova, S.N. Why triangular and trapezoid membership functions: A simple explanation. In Proceedings of the World Conference on Soft Computing, Baku, Azerbaijan, 29–31 May 2018.
  21. Kosheleva, O.; Kreinovich, V. Why triangular membership functions are often efficient in F-transform applications: Relation to probabilistic and interval uncertainty and to Haar wavelets. In Proceedings of the 17th International Conference on Information Processing and Management of Uncertainty in Knowledge-Based Systems IPMU'2018, Cadiz, Spain, 11–15 June 2018.
  22. Nguyen, H.T.; Kreinovich, V. Applications of Continuous Mathematics to Computer Science; Kluwer: Dordrecht, The Netherlands, 1997.
  23. Aczél, J.; Dhombres, J. Functional Equations in Several Variables; Cambridge University Press: Cambridge, UK, 2008.
